TITLE: Three Dimensional Structure of a Sterile Alpha Motif Domain
FIELD OF THE INVENTION 

The invention relates to the three dimensional structure of a sterile alpha motif (SAM)
domain. The atomic coordinates that define the structure and any compounds bound to the structure 
enable the determination of homologues, the three dimensional structures of polypeptides with 
unknown structure, and the identification of modulators of a SAM domain. 
BACKGROUND OF THE INVENTION 

The Eph family of receptor tyrosine kinases has been implicated in the control of axon
guidance [Henkemeyer, 1996; Orioli, 1996], cell migration [Krull, 1997], patterning of the nervous 
system [Xu, 1996] and angiogenesis [Wang, 1998], and are activated by clustering into dimers or 
tetramers [Stein, 1998]. However, the cell-surface ligands for Eph receptors (ephrins) apparently lack 
an intrinsic ability to induce receptor oligomerization [Lackmann, 1997]. Factors that influence 
receptor aggregation include the pre-clustering of ephrins [Davis, 1994], the homotypic interaction
between the extracellular domains of two receptor chains [Lackmann, 1998], and the binding of PDZ 
domain containing proteins to the receptor's C-terminus [Hock, 1998]. 

All Eph receptors have a Sterile Alpha Motif (SAM) domain within their cytoplasmic 
regions. The SAM domain was identified as a conserved sequence present in a small set of yeast 
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz, 
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain
[Klambt, 1993]. The domain is found in a variety of proteins, many of which contain catalytic 
domains or recognized protein interaction domains. SAM domains are almost always located at a 
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of 
Eph receptors (approx. 50 % identity over 14 family members), C-terminal to the catalytic domain 
and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998].
Amongst receptor tyrosine kinases, the presence of a cytoplasmic module other than the protein 
kinase domain is unique to Eph receptors. 

The SAM domain can function as a protein interaction module through an ability to homo- 
and hetero-dimerize with other SAM domains [Jousset, 1997; Peterson, 1997; Tu, 1997; Kyba, 1998]. 
This dimerizing property elicits oncogenic activation of chimeric proteins arising from translocation 
of the SAM domain of TEL to coding regions of the PDGFβ receptor [Golub, 1994], Abl [Golub,
1996], and JAK2 protein kinases [Lacronique, 1997] or the AML1 transcription factor [Golub, 1995].
A functional role in mediating homo- and hetero-typic dimerization has been shown for SAM domains
in the transcription factor TEL [Jousset, 1997], members of the polycomb group of transcriptional
repressors (RAE28, Scm and ph) [Peterson, 1997], the protein kinase Byr2p [Tu, 1997], and the α
and β isoforms of the liprin scaffolding proteins [Serra-Pages, 1998].
SUMMARY OF THE INVENTION 

Broadly stated, the present invention relates to the three-dimensional structure of one or 
more SAM domains. The three-dimensional structures may be complexed with one or more 
compounds. The defined boundaries and properties of the structures and any of the compounds bound 
to it are pertinent to methods for determining the three-dimensional structures of polypeptides with
unknown structure, and to methods that identify modulators of SAM domain function. These 
modulators are potentially useful as therapeutics for diseases, including (but not limited to) cell 
proliferative diseases, such as cancer, angiogenesis, atherosclerosis, and arthritis, and diseases 
associated with the nervous system. 

Broadly stated, the present invention relates to a crystalline form of a polypeptide
corresponding to one or more SAM domains, preferably one or more SAM domains of an Eph 
receptor, preferably of EphA. The crystalline form may comprise one or more heavy metal atoms, or 
at least one compound. In a preferred embodiment, a unit cell of the crystalline form of the invention 
has dimensions of about a = b = 77.14 ± 0.03 angstroms, c = 24.3 ± 0.04 angstroms.

The invention also relates to a method of forming a crystalline form of the invention 
comprising 

(a) mixing a volume of a SAM domain with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed 
container under conditions suitable for crystallization. 

The invention also features a method of determining three dimensional structures of 
polypeptides with unknown structure comprising the step of applying the structural atomic 
coordinates of a crystalline form of one or more SAM domains of the invention. 

Methods are also provided for identifying a potential modulator of a SAM domain function,
preferably a SAM domain of an Eph receptor function, by docking a computer representation of a
structure of a compound with a computer representation of a structure of one or more SAM domains
of the invention, preferably a SAM domain of an Eph receptor, that is defined by the atomic structural
coordinates described herein. In an embodiment the method comprises the following steps:

(a) docking a computer representation of a compound from a computer data base with 
a computer representation of a selected site on a SAM domain, preferably a SAM 
domain of an Eph receptor, to obtain a complex; 

(b) determining a conformation of the complex with a favourable geometric fit and 
favourable complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of 
SAM domain function. 

In another embodiment the method comprises the following steps: 

(a) modifying a computer representation of a compound complexed with a selected site 
on a SAM domain, preferably a SAM domain of an Eph receptor, by deleting or 
adding a chemical group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit and 
favourable complementary interactions; and 

(c) identifying a compound that best fits the selected site as a potential modulator of a 
SAM domain. 

In still another embodiment the method comprises the following steps: 

(a) selecting a computer representation of a compound complexed with a selected site
on a SAM domain, preferably a SAM domain of an Eph receptor; and

(b) searching for molecules in a data base that are similar to the compound using a
searching computer program, or replacing portions of the compound with similar
chemical structures from a data base using a compound building computer
program.

The invention also features a potential modulator of a function of a SAM domain, preferably
a SAM domain of an Eph receptor, identified by the methods of the invention, and a method of
treating a disease associated with a SAM domain, preferably a SAM domain of an Eph receptor, with
inappropriate activity in a cellular organism, comprising:

(a) administering a modulator identified using the methods of the invention in an
acceptable pharmaceutical preparation; and

(b) activating or inhibiting a SAM domain function to treat the disease.

The invention also provides peptides that mediate SAM domain function.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in relation to the drawings in which:

Figure 1A shows a sequence alignment of SAM domains from selected proteins (SEQ. ID.
NOS. 1 to 21);

Figure 1B shows a selection of multi-domain proteins containing a SAM domain (S);

Figure 2A is a ribbons depiction of the SAM homo-dimer viewed down the twofold
symmetry axis;

Figure 2B is a ribbons depiction of the SAM homo-dimer viewed perpendicular to the
symmetry axis;

Figure 2C is a ribbons stereo view highlighting the dimer interface region;

Figure 3A is a molecular surface and worm representation of the SAM homodimer;

Figure 3B is a molecular surface and worm representation of the SAM homodimer; and

Figure 4 is a gel filtration elution profile of wild type and single or double site mutants of the
EphA4 receptor SAM domain.

DETAILED DESCRIPTION OF THE INVENTION
DEFINITIONS:

Unless otherwise indicated, all terms used herein have the same meaning as they would to
one skilled in the art of the present invention. Practitioners are particularly directed to Current
Protocols in Molecular Biology (Ausubel) for definitions and terms of the art.

Abbreviations for amino acid residues are the standard 3-letter and/or 1-letter codes used in
the art to refer to one of the 20 common L-amino acids. Likewise abbreviations for nucleic acids are
the standard codes used in the art.

The term "crystalline form" in the context of the invention, is a crystal formed from an
aqueous solution comprising a purified polypeptide comprising one or more SAM domains,
preferably a SAM domain of an Eph receptor. A crystalline form of a SAM domain is characterized
as being capable of diffracting x-rays in a pattern defined by one of the crystal forms depicted in
Blundell et al., 1976, Protein Crystallography, Academic Press. A crystalline form may include a
crystal structure in association with one or more heavy-metal atoms i.e. a derivative crystal, or a
crystal structure in association with one or more compounds i.e. a co-crystal.

The term "association" refers to a condition of proximity between a chemical entity or 
compound or portions or fragments thereof, and a SAM domain or portions or fragments thereof. The 
association may be non-covalent i.e. where the juxtaposition is energetically favored by for example, 
hydrogen-bonding, van der Waals, or electrostatic or hydrophobic interactions, or it may be covalent.

The term "heavy-metal atoms" refers to an atom that is a transition element, a lanthanide 
metal, or an actinide metal. Lanthanide metals include elements with atomic numbers between 57 and 
71, inclusive. Actinide metals include elements with atomic numbers between 89 and 103, inclusive. 

The term "Eph receptor" refers to a subfamily of closely related transmembrane receptor 
tyrosine kinases related to Eph, a receptor named for its expression in an erythropoietin-producing 
human hepatocellular carcinoma cell line. The receptors contain cell adhesion-like domains on their
extracellular surface. The Eph subfamily receptor tyrosine kinases are more specifically characterised 
as encoding a structurally related cysteine rich extracellular domain containing a single 
immunoglobulin (Ig)-like loop near the N-terminus and two fibronectin III (FN III) repeats adjacent 
to the plasma membrane. The Eph receptors are divided into two groups based on the relatedness of 
their extracellular domain sequences. The grouping also corresponds to the ability of the receptors to 
bind preferentially to the ephrin-A or ephrin-B proteins. The group that includes receptors interacting 
preferentially with ephrin-A proteins is called EphA and includes EphA1 (also known as Eph and
Esk), EphA2 (also known as Eck, Myk2, Sek2), EphA3 (also known as Cek4, Mek4, Hek, Tyro4,
Hek4), EphA4 (also known as Sek, Sek1, Cek8, Hek8, Tyro1), EphA5 (also known as Ehk1, Bsk,
Cek7, Hek7, and Rek7), EphA6 (Ehk2 and Hek12), EphA7 (also known as Mdk1, Hek11, Ehk3, Ebk,
Cek11), and EphA8 (also known as Eek, Hek3). The group that includes receptors interacting
preferentially with ephrin-B proteins is called EphB and includes EphB1 (also known as Elk, Cek6,
Net, Hek6), EphB2 (also known as Cek5, Nuk, Erk, Qek5, Tyro5, Sek3, Hek5, Drt), EphB3 (also
known as Cek10, Hek2, Mdk5, Tyro6, and Sek4), EphB4 (also known as Htk, Myk1, Tyro11, Mdk2),
EphB5 (also known as Cek9, Hek9), and EphB6 (also known as Mep).

"Ephrin" refers to a class of ligands which are anchored to the cell membrane through a 
transmembrane domain, and bind to the extracellular domain of an Eph receptor, facilitating 
dimerization and autophosphorylation of the receptor and autophosphorylation of the ligand. The 
ephrins which are targeted in the methods of the invention are those that bind to and activate (i.e. 
phosphorylate) an EphA or an EphB receptor, preferably an EphA receptor. The ephrin-A ligands 
(GPI-anchored ligands) are ephrin-A1 (also known as B61, LERK1, EFL-1), ephrin-A2 (also known as
LERK6, Elf1, mCek7-L, cElf1), ephrin-A3 (also known as LERK3, Ehk1-L, and EFL-2), ephrin-A4
(also known as LERK4, EFL-4, mLERK4), ephrin-A5 (AL1, LERK7, EFL-5, mAL1, [rLERK7],
RAGS), and the ephrin-B ligands (transmembrane ligands) are ephrin-B1 (also known as LERK2,
ELK-L, EFL-3, Cek5-L, Stra1, [LERK2]), ephrin-B2 (also known as LERK5, HTK-L, NLERK1,
Elf2, Htk-L), and ephrin-B3 (also known as LERK8, ELK-L3, NLERK2, EFL-6, Elf3, [rELK-L3]).

The term "SAM domain" refers to a region known as the Sterile Alpha Motif (SAM) domain 
within the cytoplasmic regions of all Eph receptors (Figure 1B), and in other proteins such as TEL
[Jousset, 1997], members of the polycomb group of transcriptional repressors (RAE28, Scm and ph)
[Peterson, 1997], the protein kinase Byr2p [Tu, 1997], the α and β isoforms of the liprin scaffolding
proteins [Serra-Pages, 1998], and tankyrase (Smith, S. et al., Science 282: 1484-1487, 1998, Accession
AF082556). The SAM domain was identified as a conserved sequence present in a small set of yeast 
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz, 
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain 
[Klambt, 1993]. Extensive database searching and sequence alignment analysis (Figure 1A) reveals 
that this domain is found in a variety of proteins, many of which contain catalytic domains or 
recognized protein interaction domains (Figure 1B). SAM domains are almost always located at a
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of
Eph receptors (approximately 50 % identity over 14 family members), C-terminal to the catalytic 
domain and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998]. 
The term also includes amino acid sequences having substantial sequence identity to a SAM domain, 
a mutant, or a subunit of a SAM domain. Preferably the SAM domain is an "Eph SAM domain" i.e. 
a SAM domain of an Eph receptor. 

"SAM domain structure" or "SAM domain three dimensional structure" refers to the three 
dimensional structure of a purified polypeptide comprising one or more SAM domains, preferably a 
crystalline form. 

As applied to polypeptides, the term "substantial sequence identity" means that two peptide
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap
weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more
preferably at least 95 percent sequence identity or more. Preferably, residue positions which are not
identical differ by conservative amino acid substitutions. For example, the substitution of amino acids
having similar chemical properties such as charge or polarity is not likely to affect the properties of
a protein. Examples include glutamine for asparagine or glutamic acid for aspartic acid.
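
The identity thresholds above can be checked on a pair of already aligned sequences. The following is a minimal sketch only: it assumes the alignment has been produced elsewhere (for example by GAP or BESTFIT), that gaps are written as "-", and that identity is counted over the columns in which neither sequence has a gap. The function names are hypothetical, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def has_substantial_identity(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the pair meets the stated minimum of 80 percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Passing threshold=90.0 or threshold=95.0 applies the preferred and more preferred limits.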

The term "mutant" refers to a polypeptide that is obtained by replacing at least one amino 
acid residue in a native SAM domain with a different amino acid residue. Mutation can also be
accomplished by adding and/or deleting amino acid residues within the native SAM domain. A 
mutant may or may not be functional. 

The term "function" refers to the ability of a modulator to enhance or inhibit the association 
between a SAM domain and a compound. 

The term "atomic structural coordinates" as used herein refers to a data set that defines the 
three dimensional structure of a molecule or molecules (e.g. unit cell axial lengths, space group). 
Structural coordinates can be slightly modified and still render nearly identical three dimensional 
structures. A measure of a unique set of structural coordinates is the root-mean-square deviation of 
the resulting structure. Structural coordinates that render three dimensional structures that deviate
from one another by a root-mean-square deviation of less than 1.5 Å may be viewed by a person of
ordinary skill in the art as identical. Structural coordinates for a SAM domain are in Table 2.
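
As an illustration of the root-mean-square deviation criterion, the sketch below computes the deviation between two equally sized lists of (x, y, z) coordinates in angstroms. It assumes the two structures have already been superimposed and that atoms are paired by list position; the superposition step itself is not shown, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the two copies of the same atom
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Two coordinate sets for which rmsd returns less than 1.5 would fall within the limit stated above.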

The term "unit cell" refers to the smallest and simplest volume element (i.e. parallelepiped-shaped
block) of a crystal that is completely representative of the unit of pattern of the crystal. The
unit cell axial lengths are represented by a, b, and c where a = x axis, b = y axis, and c = z axis. Those
of skill in the art understand that a set of atomic coordinates determined by X-ray crystallography is 
not without standard error. 

The term "space group" refers to the symmetry of a unit cell. In a space group designation 
the capital letter indicates the lattice type and the other symbols represent symmetry operations that 
can be carried out on the unit cell without changing its appearance. 

The term "purified" in reference to a polypeptide does not require absolute purity such as a
homogeneous preparation; rather, it represents an indication that the sequence is relatively purer than in
the natural environment. Generally, a purified polypeptide is substantially free of other proteins, 
lipids, carbohydrates, or other materials with which it is naturally associated, preferably at a 
functionally significant level for example at least 85% pure, more preferably at least 95% pure, most 
preferably at least 99% pure. A skilled artisan can purify a polypeptide comprising a SAM domain 
using standard techniques for protein purification. A substantially pure polypeptide comprising a SAM
domain will yield a single major band on a non-reducing polyacrylamide gel. The purity of the SAM 
domain polypeptide can also be determined by amino-terminal amino acid sequence analysis. 
Three Dimensional Structure of SAM Domain 

The present invention provides a purified SAM domain three dimensional structure. In an 
embodiment the structure is a crystalline form. A SAM domain structure may comprise one or more 
SAM domains in a unit cell, preferably two, three or four SAM domains. In a preferred embodiment, 
a SAM domain is arranged in a crystalline manner in a space group P6₄ so as to form a unit cell of
dimensions a = b = 77.14 angstroms, c = 24.37 angstroms and which effectively diffracts X-rays for
determination of the atomic coordinates of the SAM domain to a resolution of about 2.9 angstroms. 
The 3-dimensional structure of a preferred SAM domain of the invention is shown in Figures 2 and 3. 
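
For orientation, the volume of the unit cell can be estimated from the axial lengths quoted above. The sketch below assumes a hexagonal cell (alpha = beta = 90 degrees, gamma = 120 degrees), which is what a = b together with a sixfold space group implies; the angles are not stated explicitly in the text, and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, gamma_degrees):
    """Volume of a cell whose alpha and beta angles are both 90 degrees."""
    return a * b * c * math.sin(math.radians(gamma_degrees))


# Axial lengths in angstroms as quoted above; gamma = 120 degrees is assumed.
volume = unit_cell_volume(77.14, 77.14, 24.37, 120.0)
```

The result is in cubic angstroms.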

A crystalline form includes native crystals, derivative crystals, and co-crystals. The native 
crystals generally comprise substantially pure polypeptides comprising one or more SAM domains in 
crystalline form. It is understood that the crystalline form is not limited to naturally occurring or 
native SAM domains but includes mutants of native SAM domains obtained by replacing at least one 
amino acid residue in a native SAM domain with a different amino acid residue or by adding or 
deleting amino acid residues within the native polypeptide, and having substantially the same three 
dimensional structure as the native SAM domain from which the mutant is derived i.e. having a set of 
atomic structural coordinates that have a root mean square deviation of less than or equal to about 2 Å
when superimposed with the atomic structure coordinates of the native SAM domain from which the 
mutant is derived when at least 50% to 100% of the atoms of the native SAM domain are included in 
the superimposition. It should be noted that the mutants contemplated herein need not exhibit SAM
domain activity. 

The derivative crystals of the invention generally comprise a crystalline SAM domain in 
covalent association with one or more heavy metal atoms. The SAM domain may correspond to a 
native or mutated SAM domain. Heavy metal atoms useful for providing derivative crystals include 
by way of example, and not limitation, gold, mercury, etc.

The invention features a crystalline form of a SAM domain in association with one or more 
compounds. The association may be covalent or non-covalent. These types of crystalline forms are 
referred to herein as co-crystals. The compound may be any organic molecule, and it may modulate 
the function of a SAM domain by for example inhibiting or enhancing its function, or it may be an 
analogue of a SAM domain. It is preferred that the geometry of the compound and the interactions 
formed between the compound and the SAM domain provide high affinity binding between the two 
molecules. High affinity binding is preferably governed by a dissociation equilibrium constant on the 
order of 1 Odorless. 

Method for Preparing Crystal Forms of SAM Domain 

The invention also features a method for creating the crystalline SAM domain structures 
described herein. The method may utilize a polypeptide comprising a SAM domain described herein 
to form a crystal. A polypeptide used in the method may be chemically synthesized in whole or in 
part using techniques that are well-known in the art. Alternatively, methods are well known to the 
skilled artisan to construct expression vectors containing the native or mutated SAM domain coding 
sequence and appropriate transcriptional/translational control signals. These methods include in vitro 
recombinant DNA techniques, synthetic techniques, and in vivo recombination/genetic 
recombination. See for example the techniques described in Sambrook et al. (Molecular Cloning: A 
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory press (1989)), and other laboratory 
textbooks. 

Crystals are grown from an aqueous solution containing the purified and concentrated SAM 
domain polypeptide by a variety of conventional processes. These processes include batch, liquid
bridge, dialysis, vapor diffusion, and hanging drop methods. (See for example, McPherson, 1982
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189: 1-23; Weber, 1991, Adv. Protein
Chem. 41:1-36). Generally, the native crystals of the invention are grown by adding precipitants to 
the concentrated solution of the SAM domain polypeptide. The precipitants are added at a 
concentration just below that necessary to precipitate the protein. Water is removed by controlled 
evaporation to produce precipitating conditions, which are maintained until crystal growth ceases. 
In an embodiment of the invention, the method generally comprises the steps of 

(a) mixing a volume of polypeptide solution with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed 
container, under conditions suitable for crystallization. 

For crystals of the invention, it has been found that hanging drops containing about 1 µl of
SAM domain polypeptide (50-150 mg/ml, preferably 100 mg/ml, in 5-2-mM, preferably 7 mM Hepes
pH 5.5 to 9, preferably 7.5) and equal volumes of reservoir buffer (50-150 mM, preferably 100 mM
cacodylate pH 5.5 to 7.5, preferably 6.5; 5-10%, preferably 7% (w/v) PEG 8000; and 10-30%,
preferably 20% (v/v) ethylene glycol) suspended overnight at room temperature provide crystals
suitable for high resolution X-ray structure determination. It will be appreciated that the above- 
described crystallization conditions can be varied and such variations can be used alone or in 
combination. For example, other buffer solutions such as Tris-HCl buffer may be used.
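
Because the drop is made from equal volumes of polypeptide solution and reservoir buffer, every component starts at half its stock concentration. The sketch below works through that arithmetic for the preferred values quoted above. The dictionary keys and the function name are hypothetical, and the calculation describes only the moment of mixing, before any water has left the drop.

```python
def mix_equal_volumes(protein_solution, reservoir_solution):
    """Concentrations immediately after mixing equal volumes of two solutions."""
    drop = {}
    for solution in (protein_solution, reservoir_solution):
        for component, concentration in solution.items():
            # each stock contributes half of the final volume
            drop[component] = drop.get(component, 0.0) + concentration / 2.0
    return drop


# Preferred values from the text; units are carried in the (hypothetical) keys.
protein = {"SAM domain polypeptide (mg/ml)": 100.0, "Hepes (mM)": 7.0}
reservoir = {
    "cacodylate (mM)": 100.0,
    "PEG 8000 (% w/v)": 7.0,
    "ethylene glycol (% v/v)": 20.0,
}
drop = mix_equal_volumes(protein, reservoir)
```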

Derivative crystals of the invention can be obtained by soaking native crystals in a solution 
containing salts of heavy metal atoms. Co-crystals of the invention can be obtained by soaking a 
native crystal in a solution containing a compound that binds the SAM domain, or they can be 
obtained by co-crystallizing the SAM domain polypeptide in the presence of one or more compounds 
that bind to the SAM domain. 

Once the crystal is grown it can be placed in a glass capillary tube and mounted onto a 
holding device connected to an X-ray generator and an X-ray detection device. Collection of X-ray 
diffraction patterns is well documented by those skilled in the art (See for example, Ducruix and
Giegé, 1992, IRL Press, Oxford, England). A beam of X-rays enters the crystal and diffracts from the
crystal. An X-ray detection device can be utilized to record the diffraction patterns emanating from
the crystal. Suitable devices include the Mar 345 imaging plate detector system with an RU200
rotating anode generator. 

Methods for obtaining the three dimensional structure of the crystalline form of a molecule 
or complex are described herein and known to those skilled in the art (see Ducruix and Giegé, supra).
Generally, the unit cell dimensions and orientation in the crystal can be determined from the spacing 
between the diffraction emissions as well as the patterns made from the emissions. The symmetry of 
the unit cell in the crystal is also determined. Each diffraction pattern emission is characterized as a 
vector and the data collected at this stage determines the amplitude of each vector. The phases of the 
vectors may be determined by the isomorphous replacement method where heavy atoms soaked into 
the crystal are used as reference points in the X-ray analysis (see for example, Otwinowski, 1991, 
Daresbury, United Kingdom, 80-86). The phases of the vectors may also be determined by molecular 
replacement (see for example, Naraza, 1994, Proteins 11:281-296). The amplitudes and phases of 
vectors from the crystalline form of an Eph SAM domain, preferably an EphA4 SAM domain, 
determined in accordance with these methods can be used to analyze other crystalline SAM domains. 

The unit cell dimensions and symmetry, and vector amplitude and phase information can be 
used in a Fourier transform function to calculate the electron density in the unit cell i.e. to generate an 
experimental electron density map. This may be accomplished using the PHASES package (Furey,
1990). Amino acid sequence structures are fit to the experimental electron density map (i.e. model
building) using computer programs (e.g. Jones, TA. et al., Acta Crystallogr A47, 100-119, 1991) to
calculate a theoretical electron density map. The theoretical and experimental electron density maps 
can be compared and the agreement between the maps can be described by a parameter referred to as 
R-factor. A high degree of overlap in the maps is represented by a low value R-factor. The R-factor 
can be minimized by using computer programs that refine the theoretical electron density map. For 
example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475) can be used for
model refinement. 
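
The R-factor referred to above is conventionally computed from observed and calculated structure factor amplitudes as R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|, summed over all reflections. The text does not spell out this formula, so the sketch below uses the conventional definition as an assumption; the function name is hypothetical, and any scaling between the two sets of amplitudes is taken to have been applied beforehand.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitude lists."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    denominator = sum(abs(f) for f in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator
```

A value of zero corresponds to perfect agreement, and lower values indicate a better fit of the model to the data.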

A three dimensional structure of the molecule or complex may be described by atoms that fit 
the theoretical electron density characterized by a minimum R value. Files can be created for the 
structure that defines each atom by coordinates in three dimensions.
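
One common way to hold such a file is the fixed-column ATOM record of the Protein Data Bank format. The sketch below assumes that format, which the text does not name, and shows how one record can be written and read back. The Atom type, both function names and the example values used with them are hypothetical, and atom-name column alignment is simplified.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def format_atom_record(atom: Atom) -> str:
    """Write the first 54 columns of a PDB-style ATOM record."""
    return (
        f"ATOM  {atom.serial:5d} {atom.name:<4s} {atom.res_name:>3s} "
        f"{atom.chain_id:1s}{atom.res_seq:4d}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
    )


def parse_atom_record(line: str) -> Atom:
    """Read the fixed-width fields back out of an ATOM record."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),  # orthogonal coordinates in angstroms
        y=float(line[38:46]),
        z=float(line[46:54]),
    )
```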
Identification of Homologues 

The knowledge of the three dimensional structure of a SAM domain, in particular the 
EphA4 SAM domain, enables one skilled in the art to identify homologues. This is achieved by 
searches of three-dimensional databases. Since structural folds are conserved to a greater extent than 
sequence, one may identify homologues with very little sequence similarity. Programs that provide
this type of database searching are known in the art and include Dali. The structural coordinates of a 
protein structure are submitted and the program performs a multiple structural alignment with 
proteins in the protein data bank. 

Methods for Determining Three Dimensional Structures 

The structure coordinates of a SAM domain structure described herein can be used as a
model for determining the three dimensional structures of additional native or mutated SAM domains
with unknown structure, as well as the structures of co-crystals of SAM domains with compounds
such as modulators (e.g. agonists or antagonists). The structure coordinates and models of a SAM
domain three dimensional structure can also be used to determine solution-based structures of native
or mutant SAM domains.

Three dimensional structure may be determined by applying the structural coordinates of a 
SAM domain structure to other data such as an amino acid sequence, X-ray crystallographic 
diffraction data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular 
replacement, and nuclear magnetic resonance methods using these other data sets are described
below.

Homology modeling (also known as comparative modeling or knowledge-based modeling) 
methods develop a three dimensional model from a polypeptide sequence based on the structures of 
known proteins. In the present invention the method utilizes a computer representation of the three 
dimensional structure of a SAM domain, preferably the EphA SAM domain, more preferably the
EphA4 SAM domain, or a complex of same, a computer representation of the amino acid sequence of
a polypeptide with an unknown structure, and standard computer representations of the structures of
amino acids. The method in particular comprises the steps of: (a) identifying structurally conserved
and variable regions in the known structure; (b) aligning the amino acid sequences of the known
structure and unknown structure; (c) generating coordinates of main chain atoms and side chain
atoms in structurally conserved and variable regions of the unknown structure based on the
coordinates of the known structure thereby obtaining a homology model; and (d) refining the
homology model to obtain a three dimensional structure for the unknown structure. This method is
well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. J.
Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135,
http://biochem.vt.edu/courses/modeling/homology.htn). Computer programs that can be used in
homology modeling are Quanta and the Homology module in the Insight II modeling package 
distributed by Molecular Simulations Inc, or MODELLER (Rockefeller University, 
www.iucr.ac.uk/sinris-top/logical/prg-modeller.html). 

In step (a) of the homology modeling method, the known SAM domain structure (e.g. 
structure of the EphA4 SAM domain) is examined to identify the structurally conserved regions 
(SCRs) from which an average structure, or framework, can be constructed for these regions of the 
protein. Variable regions (VRs), in which known structures may differ in conformation, also must be 
identified. SCRs generally correspond to the elements of secondary structure, such as alpha-helices 
(the four α-helices in the EphA4 SAM domain) and beta-sheets, and to ligand- and substrate-binding
sites. The VRs usually lie on the surface of the proteins and form the loops where the main chain 
turns. 

Many methods are available for sequence alignment of known structures and unknown 
structure. Sequence alignments generally are based on the dynamic programming algorithm of 
Needleman and Wunsch [J. Mol. Biol. 48: 443-453, 1970]. Current methods include FASTA, 
Smith-Waterman, and BLASTP, with the BLASTP method differing from the other two in not allowing 
gaps. Scoring of alignments typically involves construction of a 20x20 matrix in which identical 
amino acids and those of similar character (i.e., conservative substitutions) may be scored higher than 
those of different character. Substitution schemes which may be used to score alignments include the 
scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983), and BLOSUM (Henikoff 
and Henikoff, Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices based on 
alignments derived from three-dimensional structures including that of Johnson and Overington (JO 
matrices) (J. Mol. Biol. 233: 716-738, 1993). 
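
By way of illustration only, the dynamic programming approach referred to above may be sketched as follows. The function name and the simple match, mismatch, and gap values are assumptions made for this sketch; in practice the match/mismatch term would be replaced by a lookup in a 20x20 substitution matrix such as PAM or BLOSUM.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative placeholders, not values taken from any published matrix.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Fill the matrix: each cell is the best of substitution or a gap.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = len(seq_a), len(seq_b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))
```

For example, aligning `"LEAVVHM"` against `"LEAVHM"` under these placeholder scores introduces a single gap in the shorter sequence.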

Alignment based solely on sequence may be used, though other structural features also may 
be taken into account. In Quanta, multiple sequence alignment algorithms are available that may be 
used when aligning a sequence of the unknown with the known structures. Four scoring systems (i.e. 
sequence homology, secondary structure homology, residue accessibility homology, CA-CA distance 
homology) are available, each of which may be evaluated during an alignment so that relative 
statistical weights may be assigned. 

When generating coordinates for the unknown structure, main chain atoms and side chain 
atoms, both in SCRs and VRs need to be modeled. A variety of approaches known to those skilled in 
the art may be used to assign coordinates to the unknown. In particular, the coordinates of the main 
chain atoms of SCRs will be transferred to the unknown structure. VRs correspond most often to the 
loops on the surface of the polypeptide and if a loop in the known structure is a good model for the 
unknown, then the main chain coordinates of the known structure may be copied. Side chain 
coordinates of SCRs and VRs are copied if the residue type in the unknown is identical to or very 
similar to that in the known structure. For other side chain coordinates, a side chain rotamer library 
may be used to define the side chain coordinates. When a good model for a loop cannot be found, 
fragment databases may be searched for loops in other proteins that may provide a suitable model for 
the unknown. If desired, the loop may then be subjected to conformational searching to identify low 
energy conformers. 
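
The transfer of main chain coordinates described above may be sketched as follows. This is a minimal illustration that assumes one representative coordinate per residue and a precomputed gapped alignment; the function name and data layout are hypothetical and do not correspond to any particular modeling package.

```python
def transfer_backbone(aligned_known, aligned_unknown, known_coords):
    """Copy template coordinates onto the unknown at aligned positions.

    aligned_known / aligned_unknown: gapped alignment strings of equal length.
    known_coords: one (x, y, z) tuple per residue of the known structure.
    Returns one entry per residue of the unknown; None marks residues with
    no template (for example, loop insertions that must be modeled separately).
    """
    model = []
    k = 0  # index of the current residue in the known structure
    for a, b in zip(aligned_known, aligned_unknown):
        if a != "-" and b != "-":
            model.append(known_coords[k])  # aligned pair: copy coordinates
        elif b != "-":
            model.append(None)  # insertion in the unknown: nothing to copy
        if a != "-":
            k += 1
    return model
```

Residues returned as `None` would then be filled from a fragment database or by conformational searching, as described above.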

Once a homology model has been generated it should be analyzed to determine its 
correctness. A computer program available to assist in this analysis is the Protein Health module in 
Quanta which provides a variety of tests. Other programs that provide structure analysis along with 
output include PROCHECK and 3D-Profiler [Luthy R. et al, Nature 356: 83-85, 1992; and Bowie, 
J.U. et al, Science 253: 164-170, 1991]. Once any irregularities have been resolved, the entire 
structure may be further refined. Refinement may consist of energy minimization with restraints, 
especially for the SCRs. Restraints may be gradually removed for subsequent minimizations. 
Molecular dynamics may also be applied in conjunction with energy minimization. 
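
The idea of minimizing with restraints that are gradually removed may be illustrated with a deliberately simplified sketch. It assumes a single coordinate, a quadratic "energy" with its minimum at `x_ideal`, and a harmonic restraint toward the template position; all names and weights are hypothetical, and a real refinement would use a molecular mechanics force field over all atoms.

```python
def minimize(x, gradient, step=0.05, iterations=500):
    """Plain gradient descent on a single coordinate."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


def staged_refinement(x_template, x_ideal, restraint_weights=(10.0, 1.0, 0.0)):
    """Minimize repeatedly while relaxing a harmonic restraint to the template.

    Toy energy: (x - x_ideal)**2 + k * (x - x_template)**2, minimized for each
    restraint weight k in turn, starting from the previous stage's result.
    """
    x = x_template
    trajectory = []
    for k in restraint_weights:
        def gradient(v, k=k):
            return 2.0 * (v - x_ideal) + 2.0 * k * (v - x_template)
        x = minimize(x, gradient)
        trajectory.append(x)
    return trajectory
```

With a strong restraint the coordinate stays close to the template value; as the weight is lowered to zero it relaxes to the unrestrained minimum.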

Molecular replacement involves applying X-ray diffraction data of a known structure to the 
incomplete X-ray crystallographic data set of a polypeptide of unknown structure. The method can be 
used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure 
when only the amplitudes are known. Commonly used computer software packages for molecular 
replacement are X-PLOR (Brunger 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta 
Crystallogr. A50:157-163), the CCP4 package (Collaborative Computational Project, Number 4, 
"The CCP4 Suite: Programs for Protein Crystallography", Acta Cryst., Vol. D50, pp. 760-763, 1994), 
and the MERLOT package (P.M.D. Fitzgerald, J. Appl. Cryst., Vol. 21, pp. 273-278, 1988). It is 
preferable that the resulting structure not exhibit a root-mean-square deviation of more than 3 Å. 
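
A root-mean-square deviation of the kind referred to above may be computed as in the following sketch. It assumes that the two coordinate sets contain equivalent atoms in the same order and have already been superimposed; the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists.

    Assumes the structures are already superimposed and atoms are paired
    by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

A value returned by `rmsd` could then be compared against the preferred limit of 3 Å stated above.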

The objective of molecular replacement is to align positions of atoms in the unit cell by 
matching electron diffraction data from two crystals. Molecular replacement computer programs 
generally involve the following steps: (1) determining the number of molecules in the unit cell and 
defining the angles between them; (2) rotating the diffraction data to define the orientation of the 
molecules in the unit cell; (3) translating the electron density in three dimensions to correctly position 
the molecules in the unit cell; (4) determining the amplitudes and phases of the X-ray diffraction data 
and calculating an R-factor from the reference data set and from the new data, wherein an 
R-factor between 30-50% indicates that the orientations of the atoms in the unit cell have been 
reasonably determined by the method; and (5) optionally decreasing the R-factor to about 20% by 
refining the new electron density map using iterative refinement techniques known to those skilled in 
the art. 
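
The R-factor mentioned in step (4) is conventionally defined as the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes. A minimal sketch under that conventional definition follows; the function name and the example amplitudes are illustrative only, and scaling between the two data sets is assumed to have been applied already.

```python
def r_factor(f_observed, f_calculated):
    """Conventional crystallographic R-factor.

    R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), over paired reflections.
    """
    if len(f_observed) != len(f_calculated) or not f_observed:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_observed, f_calculated))
    denominator = sum(abs(fo) for fo in f_observed)
    return numerator / denominator
```

Expressed as a percentage, the returned fraction can be compared with the 30-50% and approximately 20% ranges given above.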

In an embodiment of the invention, a method is provided for determining three dimensional 
structures of polypeptides with unknown structure by applying the structural coordinates of a SAM 
domain structure to an incomplete X-ray crystallographic data set for a polypeptide of unknown 
structure, and determining a low energy conformation of the resulting structure. 

The structural coordinates of a SAM domain structure may be applied to nuclear magnetic 
resonance (NMR) data to determine the three dimensional structures of polypeptides. (See for 
example, Wuthrich, 1986, John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J. 
Molecular Biology 189: 383-386; Kline et al., 1986 J. Molecular Biology 189:377-382). While the 
secondary structure of a polypeptide may often be determined by NMR data, the spatial connections 
between individual pieces of secondary structure are not as readily determined. The structural 
coordinates of a polypeptide defined by X-ray crystallography can guide the NMR spectroscopist to 
an understanding of the spatial interactions between secondary structural elements in a polypeptide of 
related structure. Information on spatial interactions between secondary structural elements can 
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In 
addition, applying the structural coordinates after the determination of secondary structure by NMR 
techniques simplifies the assignment of NOE's relating to particular amino acids in the polypeptide 
sequence and does not greatly bias the NMR analysis of polypeptide structure. 

In an embodiment, the invention relates to a method of determining three dimensional 
structures of polypeptides with unknown structures by applying the structural coordinates of a SAM 
domain structure to nuclear magnetic resonance (NMR) data of the unknown structure. This method 
comprises the steps of: (a) determining the secondary structure of an unknown structure using NMR 
data; and (b) simplifying the assignment of through-space interactions of amino acids. The term 
"through-space interactions" defines the orientation of the secondary structural elements in the three 
dimensional structure and the distances between amino acids from different portions of the amino 
acid sequence. The term "assignment" defines a method of analyzing NMR data and identifying 
which amino acids give rise to signals in the NMR spectrum. 
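
One way in which known structural coordinates could guide the assignment of through-space interactions is to list residue pairs that lie close together in the known structure and are therefore candidates for NOE signals. The following sketch assumes one representative coordinate per residue and a distance cutoff of 5 Å; the cutoff, the minimum sequence separation, and the function name are assumptions made for illustration.

```python
import math
from itertools import combinations


def predicted_contacts(coords, cutoff=5.0, min_separation=3):
    """List residue index pairs expected to be close in space.

    coords: one (x, y, z) tuple per residue, taken from the known structure.
    Pairs closer than `min_separation` in sequence are skipped because their
    proximity says little about how secondary structure elements pack.
    """
    contacts = []
    for i, j in combinations(range(len(coords)), 2):
        if j - i >= min_separation and math.dist(coords[i], coords[j]) <= cutoff:
            contacts.append((i, j))
    return contacts
```

The resulting list can be used as a set of hypotheses to test against the observed two-dimensional NMR cross-peaks, rather than as assignments in themselves.
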
Identification of Potential Modulators of SAM Domains 

Modulators of a SAM domain may be designed and identified that may modify the 
inappropriate activity of a SAM domain involved in a clinical disorder. The rational design and 
identification of modulators of SAM domains can be accomplished by utilizing the atomic structural 
coordinates that define a SAM domain's three dimensional structure. 

Modulators may include substances that bind to or mimic the residues of a SAM domain that 
are required for dimerization of SAM domains. For example, a substance that binds to or mimics the 
interface residues of an EphA SAM domain (e.g. Val 913, Val 914, Met 972, Met 976, Met 979, Val 
944, and Leu 940), or the proximal residues of an EphA SAM domain (e.g. Ile 959 to Lys) may 
modify inappropriate activity of a SAM domain involved in a clinical disorder. 

Structure-based modulator design identification methods are powerful techniques that can 
involve searches of computer databases containing a variety of potential modulators and chemical 
functional groups. (See Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Guida, 1994, Current Opinion in 
Struc. Biol. 4: 777; and Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of 
structure-based drug design and identification; and Kuntz et al., 1982, J. Mol. Biol. 162: 269; Kuntz et 
al., 1994, Acc. Chem. Res. 27: 117; Meng et al., 1992, J. Comput. Chem. 13: 505; Bohm, 1994, J. 
Comp. Aided Molec. Design 8: 623 for methods of structure-based modulator design). 

The SAM domain three dimensional structure described herein, and the three dimensional 
structures of other polypeptides determined by the homology modeling, molecular replacement, and 
NMR techniques described herein can also be applied to modulator design and identification 
methods. 

Modulators of SAM domains may be identified by docking the computer representation of 
compounds from a database of molecules. Databases which may be used include ACD (Molecular 
Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallographic Data 
Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge 
(Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of 
California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer 
programs such as CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited) 
can be used to convert a data set represented in two dimensions to one represented in three 
dimensions. 

Generally, the computer programs comprise the following steps: 

(a) docking the structure of a compound into an active-site of a polypeptide (e.g. EphA4 
SAM domain) using the computer program, or by interactively moving the compound 
into the active-site; 

(b) characterizing the geometry and the complementary interactions formed between the 
atoms of the active-site and the compound; and optionally 

(c) searching libraries for molecular fragments which can fit into the empty space between 
the compound and active site and can be linked to the compound; and 

(d) linking the fragments found in (c) to the compound and evaluating the new modified 
compound. 

"Docking" refers to a process of placing a compound in close proximity with an active site 
of a polypeptide (e.g. an Eph SAM domain), or a process of finding low energy conformations of a 
compound/polypeptide complex (e.g. compound/Eph SAM domain). 

Examples of other computer programs that may be used for structure-based modulator 
design are CAVEAT (Bartlett et al., 1989, in "Chemical and Biological Problems in Molecular 
Recognition", Roberts, S.M. Ley, S.V.; Campbell, N.M. eds; Royal Society of Chemistry: 
Cambridge, pp 182-196); FLOG (Miller et al., 1994, J. Comp. Aided Molec. Design 8:153); PRO 
Modulator (Clark et al., 1995 J. Comp. Aided Molec. Design 9:13); MCSS (Miranker and Karplus, 
1991, Proteins: Structure, Function, and Genetics 8:195); and, GRID (Goodford, 1985, J. Med. Chem. 
28:849). 

In an embodiment of the invention, a method is provided for identifying potential 
modulators of SAM domain function. The method utilizes the structural coordinates of a SAM 
domain three dimensional structure. The method comprises the steps of (a) removing a computer 
representation of a SAM domain structure, preferably an Eph SAM domain structure, more 
preferably an EphA4 SAM domain structure, and docking a computer representation of a compound 
from a computer data base with a computer representation of the active site of the SAM domain; (b) 
determining a conformation of the complex with a favourable geometric fit or favorable 
complementary interactions; and (c) identifying compounds that best fit the SAM domain active-site 
as potential modulators of SAM domain function. The initial SAM domain structure may or may not 
have compounds bound to it. A favourable geometric fit occurs when the surface area of a 
compound in a compound-SAM domain complex is in close proximity with the surface area of the 
active-site of the SAM domain without forming unfavorable interactions. A favourable 
complementary interaction occurs where a compound in a compound-SAM domain complex interacts 
by hydrophobic, aromatic, ionic, or hydrogen donating and accepting forces, with the active-site of a 
SAM domain without forming unfavorable interactions. Unfavourable interactions may be steric 
hindrance between atoms in the compound and atoms in the SAM active-site. 
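
A simple way to quantify the geometric fit described above is to count close, non-clashing atom pairs and, separately, sterically hindered pairs. The sketch below is illustrative only: the distance thresholds of 2.5 Å and 4.5 Å and the function name are assumptions, and docking programs use considerably more detailed scoring functions.

```python
import math


def geometric_fit(compound_atoms, site_atoms, clash=2.5, contact=4.5):
    """Count favourable contacts and steric clashes between two atom sets.

    compound_atoms / site_atoms: lists of (x, y, z) coordinates in angstroms.
    A pair closer than `clash` counts as steric hindrance; a pair between
    `clash` and `contact` counts as a close, favourable contact.
    Returns (contacts, clashes).
    """
    contacts = 0
    clashes = 0
    for c in compound_atoms:
        for s in site_atoms:
            d = math.dist(c, s)
            if d < clash:
                clashes += 1
            elif d <= contact:
                contacts += 1
    return contacts, clashes
```

Under this scheme a candidate pose with many contacts and no clashes would be ranked above one with the same contacts and one or more clashes.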

In another embodiment, potential modulators are identified utilizing a three dimensional 
structure of a SAM domain with or without compounds bound to it. The method comprises the steps 
of (a) modifying a computer representation of a SAM domain (e.g. an Eph SAM domain) having one 
or more compounds bound to it, where the computer representations of the compound or compounds 
and SAM domain are defined by atomic structural coordinates; (b) determining a conformation of the 
complex with a favorable geometric fit and favorable complementary interactions; and (c) identifying 
the compounds that best fit the SAM active site as potential modulators. A computer representation 
may be modified by deleting or adding a chemical group or groups. Computer representations of the 
chemical groups can be selected from a computer database. 

Another way of identifying potential modulators is to modify an existing modulator in the 
polypeptide active-site. The computer representation of modulators can be modified within the 
computer representation of a SAM domain active-site. This technique is described in detail in 
Molecular Simulations User Manual, 1995 in LUDI. The computer representation of a modulator 
may be modified by deleting a chemical group or groups, or by adding a chemical group or groups. 
After each modification to a compound, the atoms of the modified compound and active-site can be 
shifted in conformation and the distance between the modulator and the active site atoms may be 
scored on the basis of geometric fit and favourable complementary interactions between the 
molecules. Compounds with favourable scores are potential modulators. 

Compounds designed by modulator building or modulator searching computer programs 
may be screened to identify potential modulators. Examples of such computer programs include 
programs in the Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and 
ISIS/DRAW (Molecular Designs Limited), and UNITY (Tripos Associates). A building program may 
be used to replace computer representations of chemical groups in a compound complexed with a 
SAM domain with groups from a computer data base. A searching program may be used to search 
computer representations of compounds from a computer database that have similar three 
dimensional structures and similar chemical groups as a compound that binds to a SAM domain. The 
programs may be operated on the structure of the active-site of the three dimensional structure of an 
Eph SAM domain, preferably an EphA4 SAM domain. 

A typical program may comprise the following steps: 

(a) mapping chemical features of the compound such as hydrogen bond donors or 
acceptors, hydrophobic/lipophilic sites, positively ionizable sites, or negatively 
ionizable sites; 

(b) adding geometric constraints to selected mapped features; and 

(c) searching data bases with the model generated in (b). 

In an embodiment of the invention a method of identifying potential modulators of a SAM 
domain, preferably an Eph SAM domain, more preferably an EphA SAM domain, is provided using 
the three dimensional conformation of the SAM domain in various modulator construction or 
modulator searching computer programs on compounds complexed with the SAM domain. The 
method comprises the steps of (a) removing a computer representation of one or more compounds 
complexed with a SAM domain; and (b) (i) searching a data base for a compound with a similar 
geometric structure or similar chemical groups to the removed compounds using a computer program 
that searches computer representations of compounds from a database that have similar three 
dimensional structures and similar chemical groups, or (ii) replacing portions of the compounds 
complexed with the SAM domain with similar chemical structures (i.e. nearly identical shape and 
volume) from a database using a compound construction computer program that replaces computer 
representations of chemical groups with groups from a computer database, where the representations 
of the compounds are defined by structural coordinates. 
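
The feature mapping, geometric constraint, and database searching steps described above may be sketched as follows. The sketch assumes that each compound is reduced to a list of typed feature points, that the geometric constraints are the pairwise distances between query features, and that a match requires every distance to agree within a tolerance; the feature labels, tolerance, and function names are hypothetical.

```python
import math
from itertools import permutations


def matches(query, compound, tolerance=1.0):
    """Return True if some subset of compound features reproduces the query.

    query / compound: lists of (feature_type, (x, y, z)).
    Feature types must agree and all pairwise distances must match the
    query's within `tolerance`, so the test is independent of orientation.
    """
    n = len(query)
    for candidate in permutations(compound, n):
        if any(q[0] != c[0] for q, c in zip(query, candidate)):
            continue
        if all(
            abs(math.dist(query[i][1], query[j][1])
                - math.dist(candidate[i][1], candidate[j][1])) <= tolerance
            for i in range(n) for j in range(i + 1, n)
        ):
            return True
    return False


def search(query, database, tolerance=1.0):
    """Return the names of database entries that satisfy the query model."""
    return [name for name, features in database.items()
            if matches(query, features, tolerance)]
```

Because only distances between features are compared, a compound stored in a different position or orientation from the query is still retrieved.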

Potential modulators of SAM domains identified using the above-described methods may be 
prepared using methods described in standard reference sources utilized by those skilled in the art. 
For example, organic compounds may be prepared by organic synthetic methods described in 
references such as March, 1994 Advanced Organic Chemistry: Reactions, Mechanisms, and 
Structure, New York, McGraw Hill. 

Cellular assays, as well as animal model assays in vivo, may be used to test the activity of a 
potential modulator of a SAM domain as well as diagnose a disease associated with inappropriate 
SAM domain activity. In vivo assays are also useful for testing the bioactivity of a potential 
modulator designed by the methods of the invention. 

The invention also relates to a potential modulator identified by the methods of the 
invention. 
Peptides 

The invention provides peptide molecules that modulate SAM domain function. The 
molecules are derived from the interface residues necessary for dimer formation. For example, 
peptides of the invention include the amino acids Val 913, Val 914, Met 972, Met 976, Met 979, Val 
944, and Leu 940 of the EphA4 SAM domain. Other proteins containing sequences corresponding to 
the sequences necessary for dimer formation of a SAM domain may be identified with a protein 
homology search, for example by searching available databases such as GenBank or SwissProt. 
Various search algorithms and/or programs may be used, including FASTA, BLAST (available as a 
part of the GCG sequence analysis package, University of Wisconsin, Madison, Wis.), or ENTREZ 
(National Center for Biotechnology Information, National Library of Medicine, National Institutes of 
Health, Bethesda, MD). 
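
As a much simplified stand-in for the database search programs named above, a query peptide can be slid along a protein sequence and each ungapped window scored by fractional identity. The function name, the identity threshold, and the example sequences in the note below are assumptions for illustration; FASTA and BLAST use substitution matrices and statistical significance estimates that are not reproduced here.

```python
def scan_for_motif(query, sequence, min_identity=0.8):
    """Find ungapped windows of `sequence` that resemble `query`.

    Returns a list of (start_index, window, identity) for every window whose
    fraction of identical positions is at least `min_identity`.
    """
    hits = []
    n = len(query)
    for start in range(len(sequence) - n + 1):
        window = sequence[start:start + n]
        identity = sum(a == b for a, b in zip(query, window)) / n
        if identity >= min_identity:
            hits.append((start, window, identity))
    return hits
```

For instance, scanning `"TTLEAVVHM"` with the query `"LEAVV"` reports a single exact hit starting at index 2.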

In accordance with an embodiment of the invention, specific peptides are contemplated that 
mediate SAM domain function comprising VVSV (SEQ ID. NO. 21), SAVVSV (SEQ ID. NO. 22), 
FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24), FSAVVSVGD (SEQ ID. NO. 25), 
VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27), FNTVDE (SEQ ID. NO. 28), 
FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO. 30), TSFNTV (SEQ ID. NO. 31), 
YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33), RSEVLG (SEQ ID. NO. 34), RSEVLGWD 
(SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36), and VPFRSEVLGW (SEQ ID. NO. 37). 

In accordance with another embodiment of the invention, specific peptides are contemplated 
that mediate SAM domain function. In particular, a peptide of the formula I is provided which 
mediates SAM domain function: 

X-X¹-X²-X³-X⁴-X⁵-X⁶     I 

wherein X and X⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino 
acids, and X¹ represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X² represents Glu, 
Asp, Ser, Ile, Ala, Arg, Lys, or Gln, preferably Glu or Asp, X³ represents Ala, Val, Glu, Phe, Ser, 
Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val, or Phe, X⁴ is Val, Leu, Met, Phe, or Ile, 
preferably Val, Leu, or Phe, and X⁵ is Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, or Cys, preferably Val or 
Ser. 
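
Whether a given peptide conforms to formula I can be checked mechanically by translating the permitted residues at each of the five core positions into a pattern. The sketch below assumes that the residue abbreviations in the definition above read Ile and Gln where the source text is degraded, uses one-letter codes, and treats the flanking groups as 0 to 70 residues of any type; the names used are illustrative.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C", "Gln": "Q",
    "Glu": "E", "Gly": "G", "His": "H", "Ile": "I", "Leu": "L", "Lys": "K",
    "Met": "M", "Phe": "F", "Pro": "P", "Ser": "S", "Val": "V",
}

# Permitted residues at the five core positions, as read from the definition above.
CORE_POSITIONS = [
    "Leu Phe Asp Ala Glu Gly",
    "Glu Asp Ser Ile Ala Arg Lys Gln",
    "Ala Val Glu Phe Ser Ile Met Leu His Gln Arg Asp",
    "Val Leu Met Phe Ile",
    "Val Ser Leu Asp Ala Pro Asn Lys Cys",
]

core = "".join(
    "[" + "".join(THREE_TO_ONE[r] for r in position.split()) + "]"
    for position in CORE_POSITIONS
)
# Flanking groups: 0 to 70 residues on either side of the five-residue core.
FORMULA_I = re.compile("[A-Z]{0,70}" + core + "[A-Z]{0,70}")


def conforms_to_formula_i(peptide):
    """Return True if the one-letter peptide sequence fits formula I."""
    return FORMULA_I.fullmatch(peptide) is not None
```

The core sequences LEAVV, FDVVS, LEFLS, and GARFL each satisfy this pattern.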

In an embodiment of the present invention a peptide of the formula I is provided: 
wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO. 38), AAGYTT (SEQ ID. NO. 39), 
FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID. NO. 41), or YKDNFTAAGYTT (SEQ 
ID. NO. 42). In another embodiment X⁶ represents HM, HMSQ (SEQ ID. NO. 43), HMSQD (SEQ 
ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46), QMMM (SEQ ID. NO. 
47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49), DITE (SEQ ID. NO. 50), 
DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE (SEQ ID. NO. 53), NLTEND 
(SEQ ID. NO. 54), or NLTENDI (SEQ ID. NO. 55). 

Preferred peptides of the formula I include the following: X-LEAVV-X⁶, X-FDVVS-X⁶, 
X-LEFLS-X⁶, X-GARFL-X⁶, LEAVV (SEQ ID. NO. 56), TTLEAVV (SEQ ID. NO. 57), LEAVVHM 
(SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO. 59), LEAVVHMSQD (SEQ ID. NO. 60), 
LEAVVHMSQDDL (SEQ ID. NO. 61), LEAVVHMSQDDLAR (SEQ ID. NO. 62), 
TTLEAVVHMS (SEQ ID. NO. 63), TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL 
(SEQ ID. NO. 65), TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67), 
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69), 
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO. 71), 
FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74), 
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS (SEQ ID. 
NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79), TSFDVVSQMMME 
(SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS (SEQ ID. NO. 82), LEFLSD 
(SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE (SEQ ID. NO. 85), 
LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87), GWDDLEFLS (SEQ ID. NO. 
88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID. NO. 90), DDLEFLSDITEE (SEQ ID. 
NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92), GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID. 
NO. 94), GARFLNLT (SEQ ID. NO. 95), GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ 
ID. NO. 97). 

In accordance with another embodiment of the invention, specific peptides are contemplated 
that mediate SAM domain function. In particular, a peptide of the formula II is provided which 
mediates SAM domain function: 

X⁷-X⁸-X⁹-X¹⁰-X¹¹-X¹²-X¹³-X¹⁴-X¹⁵-X¹⁶     II 

wherein X⁷ and X¹⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino 
acids, and X⁸ represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X⁹ represents Arg, Ser, 
Lys, Met, Leu, Glu, Gln, or Asn, preferably Gln or Arg, X¹⁰ represents Thr, Ala, Arg, Leu, Ser, Glu, 
Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X¹¹ represents Gln, Ser, Glu, Leu, Phe, Asp, 
Thr, or Arg, preferably Gln or Arg, X¹² represents Met, Ala, Ile, Asn, Ser, Arg, Thr, Pro, Leu, Gln, Val, 
or Lys, preferably Met or Arg, X¹³ represents Gln, Asn, Pro, Ser, Tyr, Glu, Leu, Arg, or Lys, preferably 
Gln, Asn, or Arg, X¹⁴ represents Gln, Ala, Pro, Asp, Leu, Lys, Ile, Glu, Arg, or Asn, preferably Gln 
or Ile, and X¹⁵ represents Met, Ile, Val, His, Ser, Arg, Lys, Phe, Cys, Glu, Tyr, Ala, Ile, Trp, or Leu. 

In an embodiment of the present invention a peptide of the formula II is provided: 
wherein X⁷ represents QA, QV, NK, SVQA (SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99), 
ILSSVQA (SEQ ID. NO. 100), NKILSSVQA (SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO. 
102), THQNKILSSVQA (SEQ ID. NO. 103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO. 
105), KLSQEINK (SEQ ID. NO. 106), ILNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO. 
108). In another embodiment X⁷ is HG, QS, HGRM (SEQ ID. NO. 109), HGRMVP (SEQ ID. NO. 
110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112). 

Preferred peptides of the formula II include the following: X⁷-MRTQMQQM-X¹⁶, 
X⁷-MRAQMNQI-X¹⁶, X⁷-NEERRSIF-X¹⁶, MRTQMQQM (SEQ ID. NO. 113), QAMRTQMQQM 
(SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115), LSSVQAMRTQMQQM (SEQ ID. 
NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117), MRTQMQQMHG (SEQ ID. NO. 118), 
MRTQMQQMHGRM (SEQ ID. NO. 119), MRTQMQQMHGRMVPV (SEQ ID. NO. 120), 
NEERRSIF (SEQ ID. NO. 121), INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID. 
NO. 123), MRAQMNQI (SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125), and 
MRAQMNQIQSVEV (SEQ ID. NO. 126). 

All of the peptides of the invention, as well as molecules substantially homologous, 
complementary or otherwise functionally or structurally equivalent to these peptides may be used for 
purposes of the present invention. In addition to full-length peptides of the invention, truncations of 
the peptides are contemplated in the present invention. Truncated peptides may comprise peptides of 
about 7 to 10 amino acid residues. 
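
The set of truncated peptides of a given parent may be enumerated as in the following sketch, which assumes that a truncation is any contiguous fragment shorter than the parent and within the 7 to 10 residue range stated above; the function name is illustrative.

```python
def truncations(peptide, min_len=7, max_len=10):
    """Enumerate contiguous fragments of `peptide` within a length range.

    The full-length peptide itself is excluded, since it is not a truncation.
    """
    fragments = []
    longest = min(max_len, len(peptide) - 1)
    for length in range(min_len, longest + 1):
        for start in range(len(peptide) - length + 1):
            fragments.append(peptide[start:start + length])
    return fragments
```

Applied to the ten-residue peptide MRTQMQQMHG, this yields nine fragments of 7 to 9 residues.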

The truncated peptides may have an amino group (-NH2), a hydrophobic group (for 
example, carbobenzoxyl, dansyl, or t-butyloxycarbonyl), an acetyl group, a 
9-fluorenylmethoxycarbonyl (FMOC) group, or a macromolecule including but not limited to lipid-fatty acid conjugates, 
polyethylene glycol, or carbohydrates at the amino terminal end. The truncated peptides may have a 
carboxyl group, an amido group, a t-butyloxycarbonyl group, or a macromolecule including but not 
limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the carboxy terminal 
end. 

The peptides of the invention may also include analogs of a peptide of the invention and/or 
truncations of the peptide, which may include, but are not limited to a peptide of the invention 
containing one or more amino acid insertions, additions, or deletions, or both. Analogs of the peptide 
of the invention exhibit the activity characteristic of the peptide e.g. interference with SAM domain 
dimer formation, and may further possess additional advantageous features such as increased 
bioavailability, stability, or reduced host immune recognition. One or more amino acid insertions may 
be introduced into a peptide of the invention. Amino acid insertions may consist of a single amino 
acid residue or sequential amino acids. 

One or more amino acids, preferably one to five amino acids, may be added to the right or 
left termini of a peptide of the invention. Deletions may consist of the removal of one or more amino 
acids, or discrete portions from the peptide sequence. The deleted amino acids may or may not be 
contiguous. The lower limit length of the resulting analog with a deletion mutation is about 7 amino 
acids. 

The invention also includes a peptide conjugated with a selected protein, or a selectable 
marker (see below) to produce fusion proteins. 

The peptides of the invention may be prepared using recombinant DNA methods. 
Accordingly, nucleic acid molecules which encode a peptide of the invention may be incorporated in 
a known manner into an appropriate expression vector which ensures good expression of the peptide. 
Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses so 
long as the vector is compatible with the host cell used. The expression vectors contain a nucleic acid 
molecule encoding a peptide of the invention and the necessary regulatory sequences for the 
transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be 
obtained from a variety of sources, including bacterial, fungal, viral, mammalian, or insect genes. 
(For example, see the regulatory sequences described in Goeddel, Gene Expression Technology: 
Methods in Enzymology 185, Academic Press, San Diego, CA (1990)). Selection of appropriate 
regulatory sequences is dependent on the host cell chosen, and may be readily accomplished by one 
of ordinary skill in the art. Other sequences, such as an origin of replication, additional DNA 
restriction sites, enhancers, and sequences conferring inducibility of transcription may also be 
incorporated into the expression vector. 

The recombinant expression vectors may also contain a selectable marker gene which 
facilitates the selection of transformed or transfected host cells. Suitable selectable marker genes are 
genes encoding proteins such as G418 and hygromycin which confer resistance to certain drugs, 
β-galactosidase, chloramphenicol acetyltransferase, firefly luciferase, or an immunoglobulin or portion 
thereof such as the Fc portion of an immunoglobulin preferably IgG. The selectable markers may be 
introduced on a separate vector from the nucleic acid of interest. 

The recombinant expression vectors may also contain genes that encode a fusion portion 
which provides increased expression of the recombinant peptide; increased solubility of the 
recombinant peptide; and/or aids in the purification of the recombinant peptide by acting as a ligand in 
affinity purification. For example, a proteolytic cleavage site may be inserted in the recombinant 
peptide to allow separation of the recombinant peptide from the fusion portion after purification of 
the fusion protein. Examples of fusion expression vectors include pGEX (Amrad Corp., Melbourne, 
Australia), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) 
which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to 
the recombinant protein. 

Recombinant expression vectors may be introduced into host cells to produce a transformant 
host cell. Transformant host cells include prokaryotic and eukaryotic cells which have been 
transformed or transfected with a recombinant expression vector of the invention. The terms 
"transformed with", "transfected with", "transformation" and "transfection" are intended to include 
the introduction of nucleic acid (e.g. a vector) into a cell by one of many techniques known in the art. 
For example, prokaryotic cells can be transformed with nucleic acid by electroporation or calcium- 
chloride mediated transformation. Nucleic acid can be introduced into mammalian cells using 
conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE- 
dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for 
transforming and transfecting host cells may be found in Sambrook et al. (Molecular Cloning: A 
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory press (1989)), and other laboratory 
textbooks. 

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For 
example, the peptides of the invention may be expressed in bacterial cells such as E. coli, insect cells 
(using baculovirus), yeast cells or mammalian cells. Other suitable host cells can be found in 
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, 
CA (1990). 

The peptides of the invention may be tyrosine phosphorylated using the method described in 
Reedijk et al. (The EMBO Journal 11(4): 1365, 1992). For example, tyrosine phosphorylation may be 
induced by infecting bacteria harbouring a plasmid containing a nucleotide sequence encoding a 
peptide of the invention, with a λgt11 bacteriophage encoding the cytoplasmic domain of the Elk 
tyrosine kinase as a LacZ-Elk fusion. Bacteria containing the plasmid and bacteriophage as a lysogen 
are isolated. Following induction of the lysogen, the expressed peptide becomes phosphorylated by 
the Elk tyrosine kinase. 

The peptides of the invention may be synthesized by conventional techniques. For example, 
the peptides may be synthesized by chemical synthesis using solid phase peptide synthesis. These 
methods employ either solid or solution phase synthesis methods (see for example, J. M. Stewart and 
J. D. Young, Solid Phase Peptide Synthesis, 2nd Ed., Pierce Chemical Co., Rockford, Ill. (1984) and G. 
Barany and R. B. Merrifield, The Peptides: Analysis, Synthesis, Biology, editors E. Gross and J. 
Meienhofer, Vol. 2, Academic Press, New York, 1980, pp. 3-254 for solid phase synthesis techniques; 
and M. Bodansky, Principles of Peptide Synthesis, Springer-Verlag, Berlin 1984, and E. Gross and J. 
Meienhofer, Eds., The Peptides: Analysis, Synthesis, Biology, supra, Vol. 1, for classical solution 
synthesis). By way of example, the peptides may be synthesized using 9-fluorenylmethoxycarbonyl 
(Fmoc) solid phase chemistry with direct incorporation of phosphotyrosine as the 
N-fluorenylmethoxycarbonyl-O-dimethylphosphono-L-tyrosine derivative. 

N-terminal or C-terminal fusion proteins comprising a peptide of the invention conjugated 
with other molecules may be prepared by fusing, through recombinant techniques, the N-terminal or 
C-terminal of the peptide, and the sequence of a selected protein or selectable marker with a desired 
biological function. The resultant fusion proteins contain the peptide fused to the selected protein or 
marker protein as described herein. Examples of proteins which may be used to prepare fusion 
proteins include immunoglobulins, glutathione-S-transferase (GST), hemagglutinin (HA), and 
truncated myc. 

Cyclic derivatives of the peptides of the invention are also part of the present invention. 
Cyclization may allow the peptide to assume a more favorable conformation for association with 
molecules in complexes of the invention. Cyclization may be achieved using techniques known in the 
art. For example, disulfide bonds may be formed between two appropriately spaced components 
having free sulfhydryl groups, or an amide bond may be formed between an amino group of one 
component and a carboxyl group of another component. Cyclization may also be achieved using an 
azobenzene-containing amino acid as described by Ulysse, L., et al., J. Am. Chem. Soc. 1995, 117, 
8466-8467. The side chains of Tyr and Asn may be linked to form cyclic peptides. The components 
that form the bonds may be side chains of amino acids, non-amino acid components or a combination 
of the two. In an embodiment of the invention, cyclic peptides are contemplated that have a beta-turn 
in the right position. Beta-turns may be introduced into the peptides of the invention by adding the 
amino acids Pro-Gly at the right position. 

It may be desirable to produce a cyclic peptide that is more flexible than the cyclic peptides 
containing peptide bond linkages as described above. A more flexible peptide may be prepared by 
introducing cysteines at the right and left position of the peptide and forming a disulphide bridge 
between the two cysteines. The two cysteines are arranged so as not to deform the beta-sheet and 
turn. The peptide is more flexible as a result of the length of the disulfide linkage and the smaller 
number of hydrogen bonds in the beta-sheet portion. The relative flexibility of a cyclic peptide can be 
determined by molecular dynamics simulations. Peptide mimetics may be designed based on 
information obtained by systematic replacement of L-amino acids by D-amino acids, replacement of 
side chains with groups having different electronic properties, and by systematic replacement of 
peptide bonds with amide bond replacements. Local conformational constraints can also be 
introduced to determine conformational requirements for activity of a candidate peptide mimetic. The 
mimetics may include isosteric amide bonds, or D-amino acids to stabilize or promote reverse turn 
conformations and to help stabilize the molecule. Cyclic amino acid analogues may be used to 
constrain amino acid residues to particular conformational states. The mimetics can also include 
mimics of inhibitor peptide secondary structures. These structures can model the 3-dimensional 
orientation of amino acid residues into the known secondary conformations of the proteins. Peptoids 
may also be used which are oligomers of N-substituted amino acids and can be used as motifs for the 
generation of chemically diverse libraries of novel molecules. 

Peptides of the invention may be developed using a biological expression system. The use of 
these systems allows the production of large libraries of random peptide sequences and the screening 
of these libraries for peptide sequences that interact with particular amino acid residues. Libraries 
may be produced by cloning synthetic DNA that encodes random peptide sequences into appropriate 
expression vectors (see Christian et al., 1992, J. Mol. Biol. 227:711; Devlin et al., 1990, Science 
249:404; Cwirla et al., 1990, Proc. Natl. Acad. Sci. USA 87:6378). Libraries may also be constructed 
by concurrent synthesis of overlapping peptides (see U.S. Pat. No. 4,708,871). 
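
The enumeration step behind a library of overlapping peptides can be sketched as follows. This is a minimal illustration only, not the method of the cited patent: the function name, the window length, the step size and the example sequence are all hypothetical and are not taken from this specification.

```python
def overlapping_peptides(sequence, length=10, step=2):
    """Return (start, peptide) pairs for windows of `length` residues
    taken every `step` residues along `sequence`.

    `start` is a 1-based residue position. Residues at the C-terminal
    end that do not fill a complete window are not covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) < length:
        return []
    last_start = len(sequence) - length
    return [(i + 1, sequence[i:i + length])
            for i in range(0, last_start + 1, step)]


# Hypothetical 12-residue sequence, used purely for illustration.
example = "ACDEFGHIKLMN"
for start, peptide in overlapping_peptides(example, length=5, step=3):
    print(start, peptide)
```

A smaller step gives more overlap between neighbouring peptides, and therefore finer mapping of an interacting region, at the cost of a larger library.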

Peptides of the invention may be used to identify lead compounds for drug development. 
The structure of the peptides described herein can be readily determined by a number of methods 
such as NMR and X-ray crystallography. A comparison of the structures of peptides similar in 
sequence, but differing in the biological activities they elicit in target molecules can provide 
information about the structure-activity relationship of the target. Information obtained from the 
examination of structure-activity relationships can be used to design either modified peptides, or 
other small molecules or lead compounds which can be tested for predicted properties as related to 
the target molecule. The activity of the lead compounds can be evaluated using assays similar to 
those described herein. 

Information about structure-activity relationships may also be obtained from co- 
crystallization studies. In these studies, a peptide with a desired activity is crystallized in association 
with a target molecule, i.e. a SAM domain, and the X-ray structure of the complex is determined. The 
structure can then be compared to the structure of the target molecule in its native state, and 
information from such a comparison may be used to design compounds expected to possess desired 
activities. 

The peptides of the invention may be converted into pharmaceutical salts by reacting with 
inorganic acids such as hydrochloric acid, sulfuric acid, hydrobromic acid, phosphoric acid, etc., or 
organic acids such as formic acid, acetic acid, propionic acid, glycolic acid, lactic acid, pyruvic acid, 
oxalic acid, succinic acid, malic acid, tartaric acid, citric acid, benzoic acid, salicylic acid, 
benzenesulfonic acid, and toluenesulfonic acids. The peptides of the invention may be used to 
prepare antibodies. Conventional methods can be used to prepare the antibodies. 

The peptides and antibodies specific for the peptides of the invention may be labelled using 
conventional methods with various enzymes, fluorescent materials, luminescent materials and 
radioactive materials. Suitable enzymes, fluorescent materials, luminescent materials, and radioactive 
material are well known to the skilled artisan. Antibodies and labeled antibodies specific for the 
peptides of the invention may be used to screen for proteins containing SAM domains. 



WO 00/37500 PCT/CA99/01209 

Computer modelling techniques known in the art may also be used to observe the interaction 
of a peptide of the invention, and truncations and analogs thereof with a SAM domain (for example, 
Homology Insight II and Discovery available from BioSym/Molecular Simulations, San Diego, 
California, U.S.A.). If computer modelling indicates a strong interaction, the peptide can be 
synthesized and tested for its ability to interfere with SAM domain dimer formation. 
Compositions and Methods of Treatment 

A purified three dimensional SAM domain structure of the invention, the peptides of the 
invention, and the modulators identified using the methods of the invention may be used to modify 
the inappropriate activity of a SAM domain involved in a clinical disorder. They may be used in the 
treatment and diagnosis of disorders associated with aberrant T cell signaling and to modulate 
telomere function. In particular, they may be useful in methods for therapy of cellular senescence and 
immortalization controlled by telomere length and telomerase activity, and as selective 
immunosuppressants (e.g. in organ transplantation). They may also be useful in the treatment of 
cancers, such as melanoma, ocular melanoma, leukemia, astrocytoma, glioblastoma, lymphoma, 
glioma, Hodgkin's lymphoma, multiple myeloma, sarcoma, myosarcoma, cholangiocarcinoma, 
squamous cell carcinoma, CLL, and cancers of the pancreas, breast, brain, prostate, bladder, thyroid, 
ovary, uterus, testis, kidney, stomach, colon and rectum, particularly leukemia including B-cell 
leukemia, T-cell leukemia, null-cell leukemia, myelogenous leukemia, and lymphocytic leukemia. 

Further, the three dimensional SAM domain structure of the invention, the peptides of the 
invention, and the modulators identified using the methods of the invention may be used to modulate 
the biological activity of an Eph receptor or Eph ligand in a cell, including inhibiting or enhancing 
signal transduction activities of the receptor or ligand, and in particular modulating a pathway in a 
cell regulated by the ligand or receptor, particularly those pathways involved in neuronal 
development, axonal migration, pathfinding and regeneration. The three dimensional SAM domain 
structure of the invention, the peptides of the invention, and modulators identified using the methods 
of the invention will be useful as pharmaceuticals to modulate axonogenesis, nerve cell interactions 
and regeneration, to treat conditions such as neurodegenerative diseases and conditions involving 
trauma and injury to the nervous system, for example Alzheimer's disease, Parkinson's disease, 
Huntington's disease, demyelinating diseases, such as multiple sclerosis, amyotrophic lateral 
sclerosis, bacterial and viral infections of the nervous system, deficiency diseases, such as Wernicke's 
disease and nutritional polyneuropathy, progressive supranuclear palsy, Shy-Drager syndrome, 
multisystem degeneration and olivopontocerebellar atrophy, peripheral nerve damage, and trauma and 
ischemia resulting from stroke. 

The present invention thus provides a method for treating cancer (e.g. leukemia), and 
disorders associated with T cell signaling, modulating telomere function, or affecting neuronal 
development or regeneration, in a subject comprising administering to a subject an effective amount 
of a three dimensional SAM domain structure of the invention, a peptide of the invention, or a 
modulator identified using the methods of the invention. The invention also contemplates a method 
for stimulating or inhibiting axonogenesis in a subject comprising administering to a subject an 
effective amount of a three dimensional SAM domain structure of the invention, a peptide of the 
invention, or a modulator identified using the methods of the invention. 

The invention still further relates to a pharmaceutical composition which comprises a 
purified three dimensional SAM domain structure of the invention, a peptide of the invention, or a 
modulator identified using the methods of the invention, and a pharmaceutically acceptable carrier, 
diluent or excipient. The pharmaceutical compositions may be used to stimulate or inhibit neuronal 
development, regeneration and axonal migration associated with neurodegenerative conditions, and 
conditions involving trauma and injury to the nervous system. They may also be used to treat cancer 
and disorders associated with T cell signaling, and modulate telomere function. 

The compositions of the invention are administered to subjects in a biologically compatible 
form suitable for pharmaceutical administration in vivo. By "biologically compatible form suitable 
for administration in vivo" is meant a form of the protein to be administered in which any toxic 
effects are outweighed by the therapeutic effects of the protein. The term subject is intended to 
include mammals and includes humans, dogs, cats, mice, rats, and transgenic species thereof. 
Administration of a therapeutically active amount of the pharmaceutical compositions of the present 
invention is defined as an amount effective, at dosages and for periods of time necessary to achieve 
the desired result. For example, a therapeutically active amount of a three dimensional SAM domain 
structure of the invention, peptides of the invention, or modulators of the invention may vary 
according to factors such as the condition, age, sex, and weight of the individual. Dosage regimes 
may be adjusted to provide the optimum therapeutic response. For example, several divided doses 
may be administered daily or the dose may be proportionally reduced as indicated by the exigencies 
of the therapeutic situation. 

The active compound may be administered in a convenient manner such as by injection 
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or 
intracerebral administration. In particular embodiments, pharmaceutical compositions of the 
invention are administered directly to the peripheral or central nervous system, for example by 
administration intracerebrally. 

A pharmaceutical composition of the invention can be administered to a subject in an 
appropriate carrier or diluent, co-administered with enzyme inhibitors or in an appropriate carrier 
such as microporous or solid beads or liposomes. The term "pharmaceutically acceptable carrier" as 
used herein is intended to include diluents such as saline and aqueous buffer solutions. Liposomes 
include water-in-oil-in-water emulsions as well as conventional liposomes (Strejan et al., (1984) J. 
Neuroimmunol 7:27). The active compound may also be administered parenterally or 
intraperitoneally. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and 
mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may 
contain a preservative to prevent the growth of microorganisms. Depending on the route of 
administration, the active compound may be coated to protect the compound from the action of 
enzymes, acids and other natural conditions which may inactivate the compound. 

The pharmaceutical compositions may be administered locally to stimulate axonogenesis 
and pathfinding, for example the compositions may be administered in areas of local nerve injury or 
in areas where normal nerve pathway development has not occurred. The pharmaceutical 
compositions may also be placed in a specific orientation or alignment along a presumptive pathway 
to stimulate axon pathfinding along that line, for example the pharmaceutical compositions may be 
incorporated on microcarriers laid down along the pathway. In particular, the pharmaceutical 
compositions of the invention may be used to stimulate formation of connections between areas of 
the brain, such as between the two hemispheres or between the thalamus and ventral midbrain. The 
pharmaceutical compositions may be used to stimulate formation of the medial tract of the anterior 
commissure or the habenular interpeduncle. 

Therapeutic administration of polypeptides may also be accomplished using gene therapy. A 
nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to 
produce high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or 
isolated nucleic acids may be introduced into cells of a subject by conventional nucleic acid delivery 
systems. Suitable delivery systems include liposomes, naked DNA, and receptor-mediated delivery 
systems, and viral vectors such as retroviruses, herpes viruses, and adenoviruses. 

The following non-limiting example is illustrative of the present invention: 

EXAMPLE 

The following methods were used to determine the crystal structure of the SAM domain of 
the Eph receptor isoform A4. 

Protein expression, mutagenesis and purification: The SAM domain of the Eph receptor isoform 
A4 (residues 890 to 981) was expressed in E. coli as a GST fusion protein using the pGEX-2T vector 
(Pharmacia). The QuikChange kit (Stratagene) was used to generate site-directed mutants for 
dimerization analysis and for heavy atom phasing. Protein was purified by affinity chromatography 
using glutathione Sepharose beads (Pharmacia). Bound protein was eluted by cleavage with 
thrombin. After concentrating to 10 mM, protein was applied to a Superdex 75 gel filtration column 
(Pharmacia) for final purification and characterization. 
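
As a rough consistency check on the concentrations quoted in this Example, a molar concentration can be converted to a mass concentration. The sketch below is illustrative only. It assumes an average residue mass of 110 Da, which is a common rule of thumb and not a value from this specification, and it ignores any residues left on the construct after thrombin cleavage. The function names are hypothetical.

```python
# Assumption: average mass per amino acid residue (rule of thumb, not from the text).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_da(first_residue, last_residue,
                   residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate mass (Da) of a fragment spanning the given residue numbers, inclusive."""
    n_residues = last_residue - first_residue + 1
    return n_residues * residue_mass


def millimolar_to_mg_per_ml(conc_mm, mass_da):
    """Convert mmol/L to mg/mL for a molecule of mass `mass_da` (g/mol)."""
    # mmol/L * g/mol = mg/L; divide by 1000 to obtain mg/mL.
    return conc_mm * mass_da / 1000.0


mass = approx_mass_da(890, 981)             # construct boundaries from the text
print(mass)                                 # estimated mass in Da
print(millimolar_to_mg_per_ml(10.0, mass))  # 10 mM expressed in mg/mL
```

Under these assumptions a 10 mM solution corresponds to roughly 100 mg/ml, in line with the protein concentration used for the hanging drops.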

Crystallization and data collection: Hanging drops containing 1 µl of 100 mg/ml native or mutant 
(Glu 941 Cys) protein in 7 mM HEPES pH 7.5 were mixed with equal volumes of reservoir buffer 
containing 100 mM cacodylate pH 6.5, 7% (w/v) PEG 8000, and 20% (v/v) ethylene glycol. Rod-like 
crystals of approximate dimensions 0.05 x 0.05 x 0.2 mm were obtained overnight. The crystals 
contain one molecule of the EphA4 SAM domain per asymmetric unit, and belong to the space group 
P6₄ (a = b = 77.14 Å, c = 24.37 Å). The solution dimer corresponds to a crystallographic dimer 
generated from the asymmetric unit by a two fold rotation parallel to the unique crystal axis. Crystals 
were cryo-protected in reservoir buffer enriched to 20% (w/v) PEG 8000 and 20% (v/v) ethylene 
glycol prior to stream freezing. Heavy atom derivatives were prepared by soaking crystals overnight 
in 1-10 mM heavy atom solution prepared in cryo-protection buffer. 
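
The reported cell constants and the statement of one molecule per asymmetric unit can be checked for plausibility with a Matthews coefficient estimate. The sketch below is not part of the original analysis. It assumes a protein mass of about 10.1 kDa (92 residues at an assumed 110 Da each), uses the six symmetry operators of space group P6₄, and applies the standard approximation that the solvent fraction is 1 - 1.23/Vm. The function names are hypothetical.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume (cubic angstroms) of a hexagonal cell with a = b and gamma = 120 degrees."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, molecules_per_cell, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return cell_volume / (molecules_per_cell * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


A, C = 77.14, 24.37             # cell constants from the text (angstroms)
SYMMETRY_OPERATORS_P64 = 6      # general positions in P6(4)
MOLECULES_PER_ASU = 1           # from the text
ASSUMED_MASS_DA = 92 * 110.0    # assumption: 92 residues at 110 Da each

volume = hexagonal_cell_volume(A, C)
vm = matthews_coefficient(volume,
                          SYMMETRY_OPERATORS_P64 * MOLECULES_PER_ASU,
                          ASSUMED_MASS_DA)
print(round(volume, 1), round(vm, 2), round(solvent_fraction(vm), 2))
```

With these assumptions the estimate comes out at roughly 2 cubic angstroms per Da, which is within the range commonly observed for protein crystals and so is compatible with a single SAM domain per asymmetric unit.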

Native and derivative diffraction data were collected on frozen crystals (108 K) using a 
MAR345 imaging plate detector system with an RU200 rotating anode generator (Table 1). Data 
processing and reduction were carried out with the HKL, DENZO, and SCALEPACK programs. 

Single isomorphous replacement (SIR) protein phases were calculated using lead derivative 
data collected on two separate protein crystals. The heavy atom site was identified by the Patterson 
search program HASSP [Terwilliger, 1987]. A Glu 941 to Cys site-directed mutant of the EphA4 
SAM domain construct was employed for mercury derivatization. The heavy atom position of the 
mercury derivative data, which was collected on three separate crystals, was identified by difference 
Fourier synthesis. Multiple isomorphous replacement and anomalous scattering (MIRAS) phases, 
using only the lead derivative anomalous signal, were calculated and iterative rounds of automatic 
solvent boundary determination/density modification were performed using the PHASES package 
[Furey, 1990]. The resultant experimental electron density map allowed for the complete tracing of 
the SAM domain backbone structure. 

Model building and Refinement: Model building was performed using O [Jones, 1991]. A starting 
model comprising approximately 65% of the total structure was refined using XPLOR [Brünger, 
1992]. Bulk solvent correction was applied during refinement and simulated annealing protocols were 
employed. The remaining structure was built into 2Fo-Fc electron density maps generated with 
XPLOR. The final refinement statistics are shown in Table 1. The first 20 residues of the SAM 
domain construct are disordered (residues 890 to 909) and have not been modeled. No amino acid 
residues occupy disallowed regions of the Ramachandran plot and 94 % occupy the most favored 
regions. 
Results: 

The X-ray crystal structure of the SAM domain from the EphA4 receptor tyrosine kinase 
(Table 1 and 2) was determined. The boundaries of the structure were defined by limited proteolysis 
and mass-spectrometry. Overall, the structure of the homodimer is oblong and arises from the 
association of two 'lobster claw' shaped subunits. Each subunit possesses a globular fold consisting 
of an N-terminal extended strand segment, followed by four short α-helices (α1 to α4) and one long 
C-terminal helix α5 (Figure 2A, 2B, and 2C). The N- and the C-termini are located on one side of 
the subunit fold, similar to other protein interaction modules with signaling function (SH3, SH2, PH 
domains etc.) [Kuriyan, 1997]. However, in contrast to these other domains, the termini compose the 
functional end of the molecule rather than lying opposite to the ligand-binding surface. As shown in 
Figure 3A and 3B, the N-terminal strand region and the C-terminal helix α5 extend from the subunit 
core and interdigitate in a pincer-like manner with the termini of a second subunit, to form an 
elaborate dimer interface. In addition to the N- and C-terminal regions, α-helices α1 and α3 
contribute side chains to the dimer interface. 

The N-terminal strands cross in an anti-parallel manner and project the side chains of Ala 
912, Val 913, Val 914 and Phe 910 downward to form one mandible of the 'lobster claw' shaped 
subunit. The C-terminal helices α5 also cross in an anti-parallel manner with each α-helix projecting 
the side chains of Met 972, Met 976, and Met 979, upwards to form the second mandible. Together 
these side chains compose a hydrophobic core that is fully continuous with those of the individual 
subunits. Residues bridging the subunit and interface cores include Trp 919, Ala 922 and Ile 923 from 
helix α1 and Leu 940 and Val 944 from helix α3. Complementing these hydrophobic interactions, 
the conserved side chain of arginine 973 forms intermolecular electrostatic interactions with the free 
carboxylate of glycine 981 and a stabilizing charge/helix dipole interaction with the C-terminus of 
helix α5 (Figure 2C). Additional polar residues located at or in close proximity to the dimer interface 
include His 980, Gln 975, His 945, Gln 977, Glu 941 and Ser 911. 

In order to identify determinants of dimerization and to test that the crystallographic dimer 
model reflects the solution structure of the EphA4 SAM domain, SAM domain residues, either singly 
or in combination, were substituted and the behaviour of these mutants was tested using size 
exclusion chromatography (Figure 4). In agreement with predictions from the crystal structure, 
mutations involving the interface residues Val 913, Val 914, Met 972, Met 976, Met 979, Val 944, 
and Leu 940 abolished dimer formation. In contrast, mutation of Val 969 to Ala, which comprises 
part of the second hydrophobic surface region (Figure 3A and 3B), did not affect dimerization, while 
mutation of the proximal residue Ile 959 to Lys appeared to disrupt the integrity of the subunit fold. 
Additionally, mutation of the surface exposed residues Glu 941, Asp 949, and Ser 968 to cysteine 
did not disrupt SAM domain dimerization. In summary, the mutagenesis results are consistent with 
and support the notion that the SAM domain dimer observed in the crystal structure represents a 
mechanism through which the SAM domain associates in solution. 

To investigate whether the dimer model for the Eph receptor SAM domain has more general 
relevance for SAM domain containing proteins, the predicted locations of residues that are required 
for the dimerization of SAM domains on other polypeptides were examined. When mutations that 
map to conserved features of the subunit core and therefore are likely to disrupt the subunit fold are 
eliminated, a number of informative mutations stand out. For example, the homo- and hetero-typic 
dimerization of the Polycomb family of transcriptional repressors ph, RAE28 and Scm, is abolished 
by mutation of two residues predicted to map to the dimer interface [Kyba, 1998]. These residues, Ile 
62 and Trp 1 of the ph SAM domain, correspond to the N-terminal strand residue Phe 910 and the α5 
helix residue Met 972, respectively, of the EphA4 SAM domain. Both residues are highly conserved 
amongst the SAM domains and yet are unlikely to affect the individual subunit fold. The mutation of 
the latter residue (Met 972 to Lys) in the EphA4 SAM domain yields a compact monomer structure 
(Figure 4). In addition, the hetero-dimerization of the SAM domain containing proteins Byr2p and 
Ste4p is disrupted by the substitution of Arg 69 with cysteine [Tu, 1997]. This mutation maps to 
the interface residue Gln 977 of the EphA4 SAM dimer, and is located at the crossing site of the two 
α5 helices. Taken together, these observations indicate that the dimer structure of the EphA4 SAM 
domain may reflect a more general mode of SAM domain dimerization. 

The crystallographic model for SAM domain dimerization is attractive for a number of 
reasons. Firstly, in the case of the Eph receptors, the linker between the SAM and the catalytic 
domains is short (5 residues of poorly conserved sequence) so that the N-termini of the dimer would 
have to be oriented in the same direction and in close proximity if the kinase domains of clustered 
receptors were to be juxtaposed. The structure shows this to be the case. Secondly, the mechanism 
of dimerization revealed by the structure could account for the observation that the SAM domain is 
found at either terminus of signaling proteins. Because the N- and C-terminal ends of the SAM 
domain compose the dimer interface, the insertion of a SAM domain at an internal site in a 
polypeptide chain would sterically restrict access to a second SAM domain, especially if the host 
sequence was itself structured. The solutions to this dilemma would be to place a SAM domain at the 
end of a protein (as is usually observed), or to surround it with long linker sequences. In this regard 
the SAM domain differs from modules such as SH2 and SH3 domains, which can readily be located 
at internal positions in a polypeptide chain since the ligand-binding site is located opposite to the 
location of the N- and C-termini [Kuriyan, 1997]. Thirdly, in the case of the liprins we have noted 
three adjacent SAM domains in a region previously shown to mediate liprin hetero-dimerization 
[Serra-Pages, 1998]. Because the C-termini of the dimerized SAM domain are in close proximity, on 
the opposite side from the N-termini, a configuration of stacked SAM domains can be readily 
envisioned. 

SAM dimerization may contribute to receptor oligomerization and activation by bringing 
catalytic elements into proximity for autophosphorylation. The SAM domain may have a direct 
inhibitory interaction with the kinase domain that can be competed away by dimerization. 
Alternatively SAM domain mediated dimerization might maintain opposing catalytic domains in a 
mutually inaccessible, and thus repressed state. The Eph SAM domains might also recruit signaling 
partners through heteromeric SAM-SAM interactions, or through specific recognition of cytoplasmic 
proteins by the Eph SAM dimer. 

SAM dimerization might be constitutive, but controlled through co-operative or antagonistic 
interactions with other clustering forces. Dimerization could potentially be controlled by 
modifications such as tyrosine phosphorylation, and indeed a residue within the SAM domain of the 
EphB1 receptor can become tyrosine phosphorylated in vivo [Stein, 1996]. Finally, the five residues 
that lie C-terminal to the Eph SAM domain represent a potential binding site for PDZ domain 
proteins [Hock, 1998], which might influence the organization of the SAM domain. 

The structure of the EphA4 domain reveals a novel mechanism through which modular 
domains control protein-protein interactions. Since SAM domains are found in cell surface receptors, 
cytoplasmic signaling proteins, and transcriptional activators and repressors, as well as chimeric 
human oncoproteins, these results have general implications for understanding the formation of 
complexes involved in normal and oncogenic signal transduction. 

Having illustrated and described the principles of the invention in a preferred embodiment, it 
should be appreciated by those skilled in the art that the invention can be modified in arrangement and 
detail without departure from such principles. All modifications coming within the scope of the 
following claims are claimed. 

All publications, patents and patent applications referred to herein are incorporated by 
reference in their entirety to the same extent as if each individual publication, patent or patent 
application was specifically and individually indicated to be incorporated by reference in its entirety. 

Detailed Description of the Drawings 

Figure 1A shows a sequence alignment of SAM domains from selected proteins. Secondary 
structure is indicated for the SAM domain from the EphA4 receptor tyrosine kinase. Residue 
numbers for the start of each SAM domain are shown on the left and GenBank accession numbers on 
the right. Conserved hydrophobic residues are colored green, acidic residues red, basic residues 
blue, polar residues orange and glycines are colored pink. Residues at the dimer interface shown in 
Figure 2C are indicated (•). Liprin α1 contains 3 SAM domains designated S1, S2 and S3. 

Figure 1B shows a selection of multi-domain proteins containing a SAM domain (S). 
Domains listed include tyrosine or serine/threonine kinase catalytic domains, myosin-like domain, 
F-actin binding domain (F-actin BD), PDZ domain, SH2 domain, inositol phosphatase catalytic domain 
(inositol p'tase), GTPase activating domain (GAP), DNA-binding domain (DNA-BD) and a 
transmembrane region (TM). 

Figure 2A, 2B, and 2C. Ribbons depiction of the SAM homo-dimer viewed (Figure 2A) 
down the twofold symmetry axis and (Figure 2B) perpendicular to the symmetry axis. The dimer 
subunits are coloured red and blue and α-helices are labeled. (Figure 2C) Ribbons stereo view 
highlighting the dimer interface region. Aromatic, aliphatic, methionine, histidine and arginine 
interacting side chains are coloured light blue, green, yellow, orange, and blue, respectively (see Figure 1A for 
residue identification). All ribbon diagrams were generated using RIBBONS [Carson, 1991]. 

Figure 3A, B. Molecular surface and worm representations of the SAM homodimer. The 
molecular surface of one subunit is shown with hydrophobic (Met, Val, Leu, Ile, Phe), basic (Arg, 
Lys) and acidic (Glu, Asp) side chains coloured green, blue and red, respectively. The two 
perspectives differ by a 90° rotation about the vertical axis. In Figure 3B the twofold rotation axis 
relating the two subunits of the dimer is shown. The buried surface area of the dimer interface is 
1923 Å². All molecular surfaces were generated using GRASP [Nicholls, 1991]. 

Figure 4. Gel filtration elution profile of wild type and single or double site mutants of the 
EphA4 receptor SAM domain. Chromatograms correspond to the loading of equivalent 
concentrations (10 mM) and total volumes (100 µl) of protein on a Superdex-75 gel filtration column 
(24 ml bed volume). The column was calibrated using Pharmacia low molecular weight standards. 

Table 2 (excerpt). Columns: x, y, z (Å), occupancy, B-factor. 

19.224 26.857 16.827 1.00 23.17 
18.136 27.348 17.596 1.00 24.34 
17.398 26.778 17.408 1.00 10.00 
17.722 26.379 14.905 1.00 23.08 
16.632 26.961 14.867 1.00 21.72 
17.879 25.083 14.636 1.00 22.36 
18.769 24.688 14.711 1.00 10.00 

1. A purified three dimensional structure of a polypeptide corresponding to one or more SAM 
domains. 

2. A three dimensional structure as claimed in claim 1, wherein the SAM domain is a SAM domain 
of an Eph receptor. 

3. A three dimensional structure as claimed in claim 2 wherein the Eph receptor is EphA. 

4. A three dimensional structure as claimed in claim 1 complexed with one or more compounds. 

5. A three dimensional structure as claimed in claim 1 comprising one or more heavy metal atoms. 

6. A purified crystalline form of a polypeptide corresponding to one or more SAM domains. 

7. A crystalline form as claimed in claim 6 having dimensions of about a = b = 77.14 ± 0.03 
angstroms, c = 24.3 ± 0.04 angstroms. 

8. A crystalline form as claimed in claim 7 having the co-ordinates set out in Table 2. 

9. A method of forming a crystalline form as claimed in claim 6 comprising 

(a) mixing a volume of a SAM domain with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed container 
under conditions suitable for crystallization. 

10. A method of determining three dimensional structures of polypeptides with SAM domains of 
unknown structure comprising the step of applying the structural atomic coordinates of a three 
dimensional structure as claimed in claim 1 or a crystalline form as claimed in claim 7 or 8. 

11. A method for identifying a potential modulator of a SAM domain of an Eph receptor function 
comprising docking a computer representation of a structure of a compound with a computer 
representation of a structure of one or more SAM domains of an Eph receptor that is defined by 
the atomic structural coordinates of the three dimensional structure as claimed in claim 2 or a 
crystalline form as claimed in claim 7 or 8. 

12. A method as claimed in claim 11 comprising the following steps: 

(a) docking a computer representation of a compound from a computer data base with a 
computer representation of a selected site on a three dimensional structure of a SAM domain 
of an Eph receptor as claimed in claim 2 or a crystalline form as claimed in claim 7 or 8 to 
obtain a complex; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 

(c) identifying compounds that best fit the selected site as potential modulators of SAM domain 
function. 

13. A method as claimed in claim 11, comprising the following steps: 

(a) modifying a computer representation of a compound complexed with a selected site on 
a three dimensional structure of a SAM domain of an Eph receptor as claimed in claim 
2 or a crystalline form as claimed in claim 7 or 8, by deleting or adding a chemical 
group or groups; 
(b) determining a conformation of the complex with a favourable geometric fit and 
favourable complementary interactions; and 

(c) identifying a compound that best fits the selected site as a potential modulator of a SAM 
domain. 

14. A method as claimed in claim 11 comprising the following steps: 

(a) selecting a computer representation of a compound complexed with a selected site on a three 
dimensional structure of a SAM domain of an Eph receptor as claimed in claim 2 or a 
crystalline form as claimed in claim 7 or 8; and 

(b) searching for molecules in a data base that are similar to the compound using a searching 
computer program, or replacing portions of the compound with similar chemical structures 
from a data base using a compound building computer program. 

15. A potential modulator of a function of a SAM domain of an Eph receptor identified by a method 
as claimed in any one of claims 11 to 14. 

16. A method of treating a disease associated with a SAM domain of an Eph receptor with 
inappropriate activity in a cellular organism, comprising: 

(a) administering a crystalline form of a polypeptide as claimed in claim 6 or a modulator 
identified using a method as claimed in any one of claims 11 to 14, in an acceptable 
pharmaceutical preparation; and 

(b) activating or inhibiting a SAM domain function to treat the disease. 

17. A method as claimed in claim 16 wherein the disease is a cell proliferative disease or disease 
associated with the nervous system. 

18. A peptide of the formula I which mediates SAM domain function: 

X-X1-X2-X3-X4-X5-X6     (I) 

wherein X and X6 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 
amino acids, and X1 represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X2 
represents Glu, Asp, Ser, Ile, Ala, Arg, Lys, or Gln, preferably Glu or Asp, X3 represents Ala, 
Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val, or Phe, X4 is Val, 
Leu, Met, Phe, or Ile, preferably Val, Leu, or Phe, and X5 is Val, Ser, Leu, Asp, Ala, Pro, Asn, 
Lys, or Cys, preferably Val or Ser. 

19. A peptide as claimed in claim 18 wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO. 
38), AAGYTT (SEQ ID. NO. 39), FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID. 
NO. 41), or YKDNFTAAGYTT (SEQ ID. NO. 42). 

20. A peptide as claimed in claim 18 wherein X6 represents HM, HMSQ (SEQ ID. NO. 43), 
HMSQD (SEQ ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46), 
QMMM (SEQ ID. NO. 47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49), 
DITE (SEQ ID. NO. 50), DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE 
(SEQ ID. NO. 53), NLTEND (SEQ ID. NO. 54), or NLTENDI (SEQ ID. NO. 55). 
21. A peptide of the formula I as claimed in claim 18 which is LEAVV (SEQ ID. NO. 56), 
TTLEAVV (SEQ ID. NO. 57), LEAVVHM (SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO. 
59), LEAVVHMSQD (SEQ ID. NO. 60), LEAVVHMSQDDL (SEQ ID. NO. 61), 
LEAVVHMSQDDLAR (SEQ ID. NO. 62), TTLEAVVHMS (SEQ ID. NO. 63), 
TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL (SEQ ID. NO. 65), 
TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67), 
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69), 
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO. 
71), FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74), 
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS 
(SEQ ID. NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79), 
TSFDVVSQMMME (SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS 
(SEQ ID. NO. 82), LEFLSD (SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE 
(SEQ ID. NO. 85), LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87), 
GWDDLEFLS (SEQ ID. NO. 88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID. 
NO. 90), DDLEFLSDITEE (SEQ ID. NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92), 
GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID. NO. 94), GARFLNLT (SEQ ID. NO. 95), 
GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ ID. NO. 97). 

22. A peptide of the formula II which mediates SAM domain function: 

X7-X8-X9-X10-X11-X12-X13-X14-X15-X16     (II) 

wherein X7 and X16 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 
amino acids, and X8 represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X9 represents 
Arg, Ser, Lys, Met, Leu, Glu, Gln, or Asn, preferably Gln or Arg, X10 represents Thr, Ala, Arg, 
Leu, Ser, Glu, Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X11 represents Gln, Ser, 
Glu, Leu, Phe, Asp, Thr, or Arg, preferably Gln or Arg, X12 represents Met, Ala, Ile, Asn, Ser, Arg, 
Thr, Pro, Leu, Gln, Val, or Lys, preferably Met or Arg, X13 represents Gln, Asn, Pro, Ser, Tyr, Glu, 
Leu, Arg, or Lys, preferably Gln, Asn, or Arg, X14 represents Gln, Ala, Pro, Asp, Leu, Lys, Ile, 
Glu, Arg, or Asn, preferably Gln or Ile, and X15 represents Met, Ile, Val, His, Ser, Arg, Lys, Phe, 
Cys, Glu, Tyr, Ala, Ile, Trp, or Leu. 

23. A peptide of the formula II as claimed in claim 22 wherein X7 represents QA, QV, NK, SVQA 
(SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99), ILSSVQA (SEQ ID. NO. 100), NKILSSVQA 
(SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO. 102), THQNKILSSVQA (SEQ ID. NO. 
103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO. 105), KLSQEINK (SEQ ID. NO. 
106), ILNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO. 108). 

24. A peptide of the formula II as claimed in claim 22 wherein X16 is HG, QS, HGRM (SEQ ID. NO. 
109), HGRMVP (SEQ ID. NO. 110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112). 
25. A peptide of the formula II as claimed in claim 22 which is MRTQMQQM (SEQ ID. NO. 113), 
QAMRTQMQQM (SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115), 
LSSVQAMRTQMQQM (SEQ ID. NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117), 
MRTQMQQMHG (SEQ ID. NO. 118), MRTQMQQMHGRM (SEQ ID. NO. 119), 
MRTQMQQMHGRMVPV (SEQ ID. NO. 120), NEERRSIF (SEQ ID. NO. 121), 
INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID. NO. 123), MRAQMNQI 
(SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125), or MRAQMNQIQSVEV (SEQ ID. NO. 
126). 

26. A peptide which mediates SAM domain function comprising VVSV (SEQ ID. NO. 21), 
SAVVSV (SEQ ID. NO. 22), FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24), 
FSAVVSVGD (SEQ ID. NO. 25), VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27), 
FNTVDE (SEQ ID. NO. 28), FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO. 
30), TSFNTV (SEQ ID. NO. 31), YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33), 
RSEVLG (SEQ ID. NO. 34), RSEVLGWD (SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36), 
and VPFRSEVLGW (SEQ ID. NO. 37). 

27. A pharmaceutical composition comprising a peptide as claimed in any one of claims 18 to 26 and 
a pharmaceutically acceptable carrier, diluent or excipient. 
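
By way of illustration only, the residue choices recited for formula I in claim 18 can be expressed as a pattern and used to screen candidate peptides. The following Python sketch is not part of the claimed subject matter. The names `FORMULA_I`, `build_core_pattern` and `find_formula_i_core` are hypothetical, the input is assumed to be in one-letter amino acid code, and the 0 to 70 residue range recited for X and X6 is treated as a simple length check on each flank.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Residues permitted at X1..X5, transcribed from claim 18.
FORMULA_I = [
    "Leu Phe Asp Ala Glu Gly",
    "Glu Asp Ser Ile Ala Arg Lys Gln",
    "Ala Val Glu Phe Ser Ile Met Leu His Gln Arg Asp",
    "Val Leu Met Phe Ile",
    "Val Ser Leu Asp Ala Pro Asn Lys Cys",
]

MAX_FLANK = 70  # upper limit recited for X and X6 (assumed to apply per flank)


def build_core_pattern(positions):
    """Compile one regex character class per position from three-letter names."""
    classes = []
    for allowed in positions:
        letters = "".join(THREE_TO_ONE[name] for name in allowed.split())
        classes.append("[" + letters + "]")
    return re.compile("".join(classes))


CORE_I = build_core_pattern(FORMULA_I)


def find_formula_i_core(peptide):
    """Return (offset, core) for the first 5-residue window that fits
    X1..X5 with both flanks within MAX_FLANK, or None if there is none."""
    width = len(FORMULA_I)
    for start in range(len(peptide) - width + 1):
        window = peptide[start:start + width]
        c_flank = len(peptide) - start - width
        if (CORE_I.fullmatch(window)
                and start <= MAX_FLANK and c_flank <= MAX_FLANK):
            return start, window
    return None


if __name__ == "__main__":
    for candidate in ["LEAVV", "TTLEAVVHMS", "FDVVS", "GARFL", "WWWWW"]:
        print(candidate, find_formula_i_core(candidate))
```

For example, `find_formula_i_core("TTLEAVVHMS")` locates the core LEAVV at offset 2, with TT and HMS as the flanking groups X and X6. The function reports only the first matching window. The same table-driven approach could be extended to formula II by listing the residues recited for X8 to X15 in claim 22.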
FIGURE 1B 
FIGURE 2C 
FIGURE 3B 
SEQ. ID. NO. 1 
EPH A4 

PEFSAVVSVGDWLQAJKMDRYKDNFTAAGYTTI^ 
AMRTQMQQMHGRMVPV 

SEQ. ID. NO. 2 

EPH B2 

PDYTSFNTVDEWLEAIKMGQYKESFANAGFTSFDVVSQMMMEDILRVGVTLA 
QVMRAQMNQIQSVEV 

SEQ. ID. NO. 3 

DGK-delta 

VHLVGTEEVAAWLEHLSLCEYKDIFTRHDIR 
LSRSAPAVEA 

SEQ. ID. NO. 4 

SHIP2 

SGLGEAGMSAWLRAIGI^RYEEGLVHNGWDDI^FLSDITEEDl^AGVQDPAHKRLL^ 
LSIC 

SEQ. ID. NO. 5 
RhoGAPp122 

LTQmAK£ACDWLRATGFPQYAQLYEDFLFPIDISLVKREHDFLDRDAIEALCRRL^LNKCAV 
MKLEISPHRKRS 

SEQ. ID. NO. 6 

Liprin α1-S1 

QWDGFTVVVWLELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNPLHRLKLRLAI 
QEIMSLTSPSAPPT 

SEQ. ID. NO. 7 

Liprin α1-S2 

NHEWIGNEWLPSLGLPQYRSYFMECLVDARMLDHLTKKDLRGQLKM 
LRRLNYDRKELE 

SEQ. ID. NO. 8 

Liprin α1-S3 

VLVWSNDRVIRWILSIGLKYANNLffiSGVHGALLALDETFDFSALALLLQIPTQNTOARAVI^R 
EFNNLLVMGT 
SEQ. ID. NO. 9 
Cortactin-BP1 

VHLWTKPDVADWLESLNLGEHKETFMDNEIDGSHIJ^^ 
QLLDR 

SEQ. ID. NO. 10 
Neurabin 

VHEWSVQQVSHWLVGLSLDQYVSEFSAQNISGEQLXQLDGNKLKALGMTSSODRALVKKKL 
KEMKMSLEKARKAQ 

SEQ. ID. NO. 11 

SLP-76 

RNVPFRSEVLGWDPDSLADYFKKLNYDCEKAVKKYHIDGARFLNLTENDIQKFPKLRW 
LSQEINKNEERRSIFTRKP 

SEQ. ID. NO. 12 

Byr2p (S. pombe) 

MEYYTSKEVAEWLKSIGLEKYIEQFSQNNIEGRHLNHLTLPLLKDLGIENTAXGKQFLKQ 
REFPRPCILRF 

SEQ. ID. NO. 13 

Ste4 (S. pombe) 

YWNWNNEAVCNWmQLGFPHK£AFEDYHILGKDIDLLSSNDLRDMGIES^ 
KKQKDKLQQE 

SEQ. ID. NO. 14 

Ste11 (S. cerevisiae) 

EKTNDLPFVQLFl^EIGCTQYLDSHQCNLVTEEEIKYLDKDILIALGVNKIGDRLKILRKSKSFQ 
RDKRIEQVNR 

SEQ. ID. NO. 15 

STE50 (S. cerevisiae) 

FSQWSW)DVITWCISTI^VEETDPLCQRLRENDIVGDIXPELCLQDLCDGDLNKAIKFKILI^ 
MRDSKLEWKDDK 

SEQ. ID. NO. 16 

ETS-1 

PRQWTETHVRDWVMWAVNEFSLKGVDFQKFCMNGAALCALGKDCFLELAPDFVGDILWEH 
LEILQKEDVKPYQVNG 

SEQ. ID. NO. 17 

FLI-1 



WO 00/37500 PCT/CA99/01209 

PTLWTQEHVRQWLEWAIKEYSLMEIDTSFFQNMDGKELCKMNKEDFLRATTLYNTEVLLSHL 
SYLRESSLLAYNTT 

SEQ. ID. NO. 18 

TEL 

PIYWSRDDVAQ\\^KWAENEFSLRPIDSNTFEMNGKALLLLTKEDFRYRSPHSGDVLYELLQHI 
LKQRKPRILFSP 

SEQ. ID. NO. 19 

RAE28 

PSQWSVEEVYEHASLQGCQEIAEEFRSQEIDGQALLIXK£EHLMSAMNIK1.GPALK1CAKIN^ 
KET 

SEQ. ID. NO. 20 
Scm 

ProWTIEEVIQYIESMDNSLAVHGDLFRKHEIDG 
NKVNGRRNNLAL 

SEQ. ID. NO. 21 

VVSV 

SEQ ID. NO. 22 

SAVVSV 

SEQ ID. NO.23 

FSAVV 

SEQ ID. NO.24 

FSAVVSV 

SEQ ID. NO. 25 

FSAVVSVGD 

SEQ ID. NO. 26 

VVSVGDWL 

SEQ ID. NO. 27 

FNTV 

SEQ ID. NO. 28 
FNTVDE 
SEQ ID. NO. 29 
FNTVDEWL 
SEQ ID. NO. 30 

TSFNTVDEWL 

SEQ ID. NO. 31 

TSFNTV 

SEQ ID. NO. 32 

YTSFNTV 

SEQ ID. NO. 33 

RSEV 

SEQ ID. NO. 34 
RSEVLG 
SEQ ID. NO. 35 
RSEVLGWD 
SEQ ID. NO. 36 
VPFRSEV 
SEQ ID. NO. 37 
VPFRSEVLGW 
SEQ ID. NO. 38 
GYTT 

SEQ ID. NO. 39 
AAGYTT 
SEQ ID. NO. 40 
FTAAGYTT 
SEQ ID. NO. 41 
DNFTAAGYTT 
SEQ ID. NO. 42 
YKDNFTAAGYTT 
SEQ ID. NO. 43 
HMSQ 

SEQ ID. NO. 44 

HMSQD 

SEQ ID. NO. 45 

HMSQDD 

SEQ ID. NO. 46 

HMSQDDLA 

SEQ ID. NO. 47 
QMMM 

SEQ ID. NO. 48 

QMMMED 

SEQ ID. NO. 49 

QMMMEDLL 

SEQ ID. NO. 50 

DITE 

SEQ ID. NO. 51 

DITEED 

SEQ ID. NO. 52 

DITEEDL 

SEQ ID. NO. 53 

NLTE 

SEQ ID. NO. 54 
NLTEND 
SEQ ID. NO. 55 
NLTENDI 
SEQ ID. NO. 56 
LEAVV 
SEQ ID. NO. 57 
TTLEAVV 
SEQ ID. NO. 58 
LEAVVHM 
SEQ ID. NO. 59 
LEAVVHMSQ 
SEQ ID. NO. 60 
LEAVVHMSQD 
SEQ ID. NO. 61 
LEAVVHMSQDDL 
SEQ ID. NO. 62 
LEAVVHMSQDDLAR 
SEQ ID. NO. 63 
TTLEAVVHMS 
SEQ ID. NO. 64 
TTLEAVVHMSQD 
SEQ ID. NO. 65 
TTLEAVVHMSQDDL 
SEQ ID. NO. 66 
TTLEAVVHMSQDDLAR 
SEQ ID. NO. 67 
GYTTLEAVV 
SEQ ID. NO. 68 
GYTTLEAVVHMS 
SEQ ID. NO. 69 

GYTTLEAVVHMSQD 
SEQ ID. NO. 70 
GYTTLEAVVHMSQDDL 
SEQ ID. NO. 71 
GYTTLEAVVHMSQDDLAR 
SEQ ID. NO. 72 
FDVVS 

SEQ ID. NO. 73 
FDVVSQ 
SEQ ID. NO. 74 
FDVVSQMM 
SEQ ID. NO. 75 
FDVVSQMMME 
SEQ ID. NO. 76 
FDVVSQMMMEDIL 
SEQ ID. NO. 77 
TSFDVVS 
SEQ ID. NO. 78 
TSFDVVSQ 
SEQ ID. NO. 79 
TSFDVVSQMM 
SEQ ID. NO. 80 
TSFDVVSQMMME 
SEQ ID. NO. 81 
TSFDVVSQMMMEDIL 
SEQ ID. NO. 82 
LEFLS 

SEQ ID. NO. 83 

LEFLSD 
SEQ ID. NO. 84 

LEFLSDIT 
SEQ ID. NO. 85 
LEFLSDITEE 
SEQ ID. NO. 86 

LEFLSDITEEDL 
SEQ ID. NO. 87 
DDLEFLS 
SEQ ID. NO. 88 
GWDDLEFLS 
SEQ ID. NO. 89 
DDLEFLSD 
SEQ ID. NO. 90 
DDLEFLSDIT 
SEQ ID. NO. 91 
DDLEFLSDITEE 
SEQ ID. NO. 92 
DDLEFLSDITEEDL 
SEQ ID. NO. 93 
GARFL 

SEQ ID. NO. 94 
GARFLN 
SEQ ID. NO. 95 
GARFLNLT 
SEQ ID. NO. 96 
GARFLNLTEN 
SEQ ID. NO. 97 
IDGARFL 
SEQ ID. NO. 98 
SVQA 

SEQ ID. NO. 99 
LSSVQA 
SEQ ID. NO. 100 
ILSSVQA 
SEQ ID. NO. 101 
NKILSSVQA 
SEQ ID. NO. 102 
HQNKILSSVQA 
SEQ ID. NO. 103 
THQNKILSSVQA 
SEQ ID. NO. 104 
ENIK 

SEQ ID. NO. 105 

SQEINK 
SEQ ID. NO. 106 

KLSQEINK 
SEQ ID. NO. 107 
ILNSIQV 
SEQ ID. NO. 108 
NSIQV 

SEQ ID. NO. 109 
HGRM 

SEQ ID. NO. 110 
HGRMVP 
SEQ ID. NO. 111 
QSVEV 

SEQ ID. NO. 112 
TRKP 

SEQ ID. NO. 113 
MRTQMQQM 
SEQ ID. NO. 114 
QAMRTQMQQM 
SEQ ID. NO. 115 
SVQAMRTQMQQM 
SEQ ID. NO. 116 
LSSVQAMRTQMQQM 
SEQ ID. NO. 117 
ILSSVQAMRTQMQQM 
SEQ ID. NO. 118 
MRTQMQQMHG 
SEQ ID. NO. 119 
MRTQMQQMHGRM 
SEQ ID. NO. 120 
MRTQMQQMHGRMVPV 
SEQ ID. NO. 121 
NEERRSIF 
SEQ ID. NO. 122 
INKNEERRSIF 
SEQ ID. NO. 123 

NEERRSIFTRKP 
SEQ ID. NO. 124 
MRAQMNQI 
SEQ ID. NO. 125 
MRAQMNQIQS 
SEQ ID. NO. 126 
MRAQMNQIQSVEV 